Tuesday, January 21, 2020

awk syntax error when using a shell variable in an if/then statement

I'm trying to get the following code to work but I keep on getting syntax errors in the awk portion of the script.

Briefly, I want to calculate a cutoff value and store it as a floating-point number in a shell variable (e.g., cutoff). I then try to pass this variable to the awk script, but I still run into syntax errors such as:

    awk: syntax error at source line 3
       context is
           >>> <<<

Here are the first four lines of a sample input file:

    >Spl-129_TTCAGTGG_80
    CAGACATAGTCATCTATCAATACATaGATGATTTGTATGTAGGATCTGACTTAGAAATAGGGCAGCATAGAACAAAAATAGAGGAACTGAGACAACATCTGTTGAGGTGGGGATTTACCACACCAGACAAAAAACATCAGAAAGAACCTCCATTCCTTTGGATGGGTTATGAACTCCATCCTGATAAATGGACAGTACAGCCTATAGTGCTGCCAGAAAAGGACAGCTGGACTGTCAATGACATACAGA
    >Spl-129_TGGGGACC_80
    CAGACATAGTCATCTATCAATACATaGATGATTTGTATGTAGGATCTGACTTAGAAATAGGGCAGCATAGAACAAAAATAGAGGAACTGAGACAACATCTGTTGAGGTGGGGATTTACCACACCAGACAAAAAACATCAGAAAGAACCTCCATTCCTTTGGATGGGTTATGAACTCCATCCTGATAAATGGACAGTACAGCCTATAGTGCTGCCAGAAAAGGACAGCTGGACTGTCAATGACATACAGA

and now the code:

    for file in *fa; do
        name=`echo $file | cut -d'.' -f1`;
        awk 'BEGIN{RS=">"}NR>1{sub("\n","\t"); gsub("\n",""); print RS$0}' $file | tail -n+2 | sed 's/_/\t/g' >tmp;
        m=`cut -f3 tmp | sort -nr | head -n1`;
        cutoff=`echo "(-1.24*10^-21*$m^6)+(3.53*10^-17*$m^5)-(3.90*10^-13*$m^4)+(2.12*10^-9*$m^3)-(6.06*10^-6*$m^2)+(0.018*$m)+3.15" | bc`;
        echo "$name\t$cutoff";

        awk -v c="$cutoff" -v n="$name" '{ 
            if (c < 4) 
           awk '$3 > 2' tmp >n"_CUT.txt"; 

           else awk '$3 > c' tmp >n"_CUT.txt"; 
    }';
    done
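The syntax error most likely comes from the nested quoting. The single quote in front of `$3 > 2` ends the outer awk program, so awk is handed a truncated program, and awk cannot invoke `awk` as a statement from inside its own program in any case. One possible fix, a minimal sketch and not necessarily the only one, is to make the decision inside a single awk call. The function name `filter_by_cutoff` is hypothetical and not part of the original script; the output name is built in the shell as `"${name}_CUT.txt"`.

```shell
# Hypothetical helper (not in the original script):
#   filter_by_cutoff CUTOFF INPUT OUTPUT
# Keeps the rows of INPUT whose third field exceeds a threshold.
# If CUTOFF is below 4, the fixed threshold 2 is used instead,
# mirroring the if/else in the original loop.
filter_by_cutoff() {
    awk -v c="$1" '
        # c + 0 forces a numeric comparison
        BEGIN { threshold = (c + 0 < 4) ? 2 : c + 0 }
        $3 > threshold
    ' "$2" > "$3"
}
```

Inside the loop this would replace the final awk block with `filter_by_cutoff "$cutoff" tmp "${name}_CUT.txt"`.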

Any help that you could provide would be much appreciated. Thanks!
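A separate point about the `bc` line in the script above: with `bc`'s default `scale` of 0, a negative exponent such as `10^-21` evaluates to 0, so the higher-order terms of the polynomial probably vanish (even `bc -l`, which sets `scale` to 20, would not be enough for `10^-21`). A minimal sketch that evaluates the same polynomial in awk's floating-point arithmetic is shown below. The function name `compute_cutoff` and the four-decimal output format are assumptions, not part of the original script.

```shell
# Hypothetical helper (not in the original script):
#   compute_cutoff M
# Evaluates the sixth-degree polynomial from the question using awk,
# which works in floating point and accepts exponent notation.
compute_cutoff() {
    LC_ALL=C awk -v m="$1" 'BEGIN {
        c  = -1.24e-21 * m^6
        c +=  3.53e-17 * m^5
        c -=  3.90e-13 * m^4
        c +=  2.12e-9  * m^3
        c -=  6.06e-6  * m^2
        c +=  0.018    * m
        c +=  3.15
        printf "%.4f\n", c   # four decimals is an arbitrary choice
    }'
}
```

In the loop, `cutoff=$(compute_cutoff "$m")` would then stand in for the `bc` pipeline.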
