Wednesday, 16 November 2016

AWK for-loop with break statement

Today I am working on correcting data errors in files that have a few unknowns: the number of fields differs from file to file, and I do not know in advance which fields and records contain the string "---".

An example of the data is:

1  2  1    39.6406  1    38.8512  1    38.3479  1    37.9744
2  1  4    39.1527  3    38.7329  2    38.3479  2    37.9744
3  3  3    39.5186  2    38.8512  3    38.2079  3    37.6385
4  4  2    39.6406  4    38.4964  ---  37.7414  ---  36.7149
5  5  ---  40.2504  ---  39.0286  ---  38.4879  ---  38.1004

The desired output is:

1  2  1    39.6406  1    38.8512  1    38.3479  1    37.9744
2  1  4    39.1527  3    38.7329  2    38.3479  2    37.9744
3  3  3    39.5186  2    38.8512  3    38.2079  3    37.6385
4  4  2    39.6406  4    38.4964  ---  ---      ---  ---
5  5  ---  ---      ---  ---      ---  ---      ---  ---

I have tried using for-loops, such as:

awk '{for (i = NF; i >= 1; i--){if ($i=="---")$(i-1)="---"}{print $0}}' file 

which resulted in:

1    2  1  39.6406  1  38.8512  1  38.3479  1  37.9744
2    1  4  39.1527  3  38.7329  2  38.3479  2  37.9744
3    3  3  39.5186  2  38.8512  3  38.2079  3  37.6385
---
---

and I also tried:

awk '{for (i=1;i<=NF;i++){if ($i=="---")$(i+1)="---"}{print $0}}' file

which resulted in the error:

"awk: program limit exceeded: maximum number of fields size=32767"
    FILENAME="file" FNR=4 NR=4
1  2  1  39.6406  1  38.8512  1  38.3479  1  37.9744
2  1  4  39.1527  3  38.7329  2  38.3479  2  37.9744
3  3  3  39.5186  2  38.8512  3  38.2079  3  37.6385

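In my first attempt, each assignment made the next comparison match as well, so the "---" propagated all the way back to the first field; the last assignment then went to $0 and replaced the whole record. In my second attempt, each assignment to $(i+1) made the following comparison match and eventually added a new field, so NF kept growing and the loop never ended on records containing the string.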
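One way to avoid both failure modes is to freeze the field count before editing and to skip the field that was just overwritten. The sketch below assumes the intended rule is "when a field is "---", the value immediately after it becomes "---" too", which matches the desired output shown above but is my reading, not something stated as a rule. The temp file and the variable names (sample, fixed, n) are only illustrative.

```shell
# Sample records copied from the post; the temp file is illustrative.
sample=$(mktemp)
cat > "$sample" <<'EOF'
3  3  3    39.5186  2    38.8512  3    38.2079  3    37.6385
4  4  2    39.6406  4    38.4964  ---  37.7414  ---  36.7149
5  5  ---  40.2504  ---  39.0286  ---  38.4879  ---  38.1004
EOF

fixed=$(awk '{
    n = NF                      # freeze the field count before editing
    for (i = 1; i < n; i++) {   # i < n: never write past the last field
        if ($i == "---") {
            $(i + 1) = "---"    # overwrite the value that follows the marker
            i++                 # skip the field that was just written
        }
    }
    $1 = $1                     # rebuild every record with single-space separators
    print
}' "$sample")

printf '%s\n' "$fixed"
rm -f "$sample"
```

Because the loop stops at n - 1 and steps over each field it writes, a replacement can neither cascade nor extend the record, so the same command works whatever the field count is. Assigning a field makes awk rebuild the record with single spaces, so the output can be piped through column -t if aligned columns are wanted.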

My gut feeling is that I need a break statement, yet after many hours of searching I can't find an example that has helped me. I know there is more than one way to skin a cat, so if you know a better way to accomplish my goal, keeping in mind that there are multiple files with different field counts, I would like to hear it. Alternatively, if you can provide an example of a break statement with one of my for-loops, I, and others looking for an example, will be extremely grateful.
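For reference, a minimal sketch of break inside an awk for-loop, under a different assumption: that every field after the first "---" in a record should become "---". That rule also reproduces the desired output for the sample data, but it is a guess at the intent. The first loop only searches and stops at the first match; a second loop does the overwriting. The variable names (line, result, first) are hypothetical.

```shell
# One record from the sample data in the post.
line='4  4  2    39.6406  4    38.4964  ---  37.7414  ---  36.7149'

result=$(printf '%s\n' "$line" | awk '{
    first = 0
    for (i = 1; i <= NF; i++) {
        if ($i == "---") {
            first = i           # remember where the first marker is
            break               # leave the loop; no need to scan further
        }
    }
    if (first) {
        # NF does not change here because j never exceeds NF
        for (j = first + 1; j <= NF; j++) $j = "---"
    }
    print
}')

printf '%s\n' "$result"
```

For this record the command prints 4 4 2 39.6406 4 38.4964 --- --- --- ---. Records without the marker are printed unchanged, since first stays 0 and no field is assigned.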

Thank you
