I am having an issue that I almost solved thanks to this post. Using a dataset in the same format:
File 1
32074_32077 1 0.008348 834830 G A
32082_32085 1 0.008349 834928 A G
32085_32088 2 0.008350 834928 G A
32903_32906 5 0.008468 846808 C T
File 2
rs3094315 1 0.020130 752566 G A
rs12124819 1 0.020242 834928 A G
rs28765502 2 0.022137 834928 T C
rs7419119 3 0.022518 846808 T G
I would like to replace the 1st column of File 1 with the 1st column of File 2, but only if both $2 and $4 of the File 1 line match $2 and $4 of the same line in File 2. If they do not, I would like to keep the line as it is.
Expected output:
32074_32077 1 0.008348 834830 G A
rs12124819 1 0.008349 834928 A G
rs28765502 2 0.008350 834928 G A
32903_32906 5 0.008468 846808 C T
Using the answer from the linked post, I cannot get the expected output. I tried this:
awk 'FNR==NR{a[$4]=$1; b[$2]=$1; next} ($4 in a && $2 in b){$1=a[$4]} 1' file1 file2
It doesn't work as expected because the condition $2 in b is always true. I understand why, but I don't know how to work around it.
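One possible workaround, as a minimal sketch rather than a definitive answer: index File 2 by the pair ($2, $4) in a single array, so both columns have to match on the same line, and read File 2 first so the lookup table exists before File 1 is processed. The array name `id`, the temporary directory, and the `result` variable are only illustrative; the sample data is the data shown above, and the sketch assumes whitespace-separated columns.

```shell
# Work in a throwaway directory so no existing file is overwritten.
tmpdir=$(mktemp -d)

# Sample data copied from the question.
cat > "$tmpdir/file1" <<'EOF'
32074_32077 1 0.008348 834830 G A
32082_32085 1 0.008349 834928 A G
32085_32088 2 0.008350 834928 G A
32903_32906 5 0.008468 846808 C T
EOF

cat > "$tmpdir/file2" <<'EOF'
rs3094315 1 0.020130 752566 G A
rs12124819 1 0.020242 834928 A G
rs28765502 2 0.022137 834928 T C
rs7419119 3 0.022518 846808 T G
EOF

result=$(awk '
  # First file on the command line (file2): remember $1 under the pair ($2, $4).
  FNR == NR { id[$2, $4] = $1; next }
  # Second file (file1): replace $1 only when the same pair was seen in file2.
  ($2, $4) in id { $1 = id[$2, $4] }
  # Print every line of file1, changed or not.
  { print }
' "$tmpdir/file2" "$tmpdir/file1")

printf '%s\n' "$result"

rm -rf "$tmpdir"
```

Two things to keep in mind with this sketch: if File 2 contains the same ($2, $4) pair more than once, the last occurrence wins, and assigning to $1 makes awk rebuild the line with single spaces between fields, so tab-separated input would need `-v OFS='\t'`.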
Thank you.