I am subsetting data from a large original dataset. I managed to select the columns I wanted into a new file, but when I try a further selection with an if statement (keeping only lines where column 28 of the new file is <=5000), my code does not keep the tab field separation and also removes the header line from my data. I am new to Linux, so any guidance would be appreciated.
awk 'BEGIN{FS="\t"} { for(i=125; i<=NF; ++i) printf $i""FS; print ""}' Bigfile.txt> Smallfile.txt
awk 'BEGIN{FS="\t"} {if($28<=5000) print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$28}' Smallfile.txt > Smallfile1.txt
The first awk line works fine and selects the 28 columns I want from my original dataset. The second line is where it goes wrong: the output is no longer tab-separated and the header is gone. I have tried removing BEGIN, adding ; in places, and using -F"\t" instead of {FS="\t"}.
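For what it is worth, here is a minimal sketch of one likely fix, based on two assumptions about the cause. First, `print $1,$2,...` joins fields with OFS, which defaults to a single space unless it is set. Second, the header's column 28 is text, so awk compares it to 5000 as a string and the test can fail for that line. The file names `demo.txt` and `demo_filtered.txt` and the sample values are made up for illustration only.

```shell
# Build a small tab-separated sample: 1 header line + 3 data rows, 28 columns.
# Column 28 holds 2500, 5000 and 7500 in the data rows (hypothetical values).
awk 'BEGIN {
    OFS = "\t"
    for (r = 0; r <= 3; r++) {
        line = ""
        for (c = 1; c <= 28; c++) {
            if (r == 0)       v = "col" c     # header names
            else if (c == 28) v = r * 2500    # numeric test column
            else              v = r "_" c     # filler data
            line = line (c > 1 ? OFS : "") v
        }
        print line
    }
}' > demo.txt

# Keep the header (NR == 1) plus every row whose column 28 is <= 5000.
# With no action block, awk prints the matching line ($0) unchanged,
# so the original tabs survive; OFS is set to tab as well in case
# fields are ever printed individually.
awk 'BEGIN { FS = OFS = "\t" } NR == 1 || $28 + 0 <= 5000' demo.txt > demo_filtered.txt
```

On the real data the second command would presumably be run with Smallfile.txt as input and Smallfile1.txt as output. Adding `+ 0` forces a numeric comparison, which is a precaution rather than something the data is known to need.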