I'm trying to get the following code to work, but I keep getting syntax errors in the awk portion of the script.
Briefly, I want to calculate a cutoff value and store it as a floating-point number in a shell variable (e.g., cutoff). I then try to pass this variable to the awk script, but I still run into syntax errors such as:
awk: syntax error at source line 3
context is
>>> <<<
Here are the first four lines of a sample input file:
>Spl-129_TTCAGTGG_80
CAGACATAGTCATCTATCAATACATaGATGATTTGTATGTAGGATCTGACTTAGAAATAGGGCAGCATAGAACAAAAATAGAGGAACTGAGACAACATCTGTTGAGGTGGGGATTTACCACACCAGACAAAAAACATCAGAAAGAACCTCCATTCCTTTGGATGGGTTATGAACTCCATCCTGATAAATGGACAGTACAGCCTATAGTGCTGCCAGAAAAGGACAGCTGGACTGTCAATGACATACAGA
>Spl-129_TGGGGACC_80
CAGACATAGTCATCTATCAATACATaGATGATTTGTATGTAGGATCTGACTTAGAAATAGGGCAGCATAGAACAAAAATAGAGGAACTGAGACAACATCTGTTGAGGTGGGGATTTACCACACCAGACAAAAAACATCAGAAAGAACCTCCATTCCTTTGGATGGGTTATGAACTCCATCCTGATAAATGGACAGTACAGCCTATAGTGCTGCCAGAAAAGGACAGCTGGACTGTCAATGACATACAGA
And now the code:
for file in *fa; do
name=`echo $file | cut -d'.' -f1`;
awk 'BEGIN{RS=">"}NR>1{sub("\n","\t"); gsub("\n",""); print RS$0}' $file | tail -n+2 | sed 's/_/\t/g' >tmp;
m=`cut -f3 tmp | sort -nr | head -n1`;
cutoff=`echo "(-1.24*10^-21*$m^6)+(3.53*10^-17*$m^5)-(3.90*10^-13*$m^4)+(2.12*10^-9*$m^3)-(6.06*10^-6*$m^2)+(0.018*$m)+3.15" | bc`;
echo "$name\t$cutoff";
awk -v c="$cutoff" -v n="$name" '{
if (c < 4)
awk '$3 > 2' tmp >n"_CUT.txt";
else awk '$3 > c' tmp >n"_CUT.txt";
}';
done
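One possible direction, offered as a sketch under assumptions rather than a confirmed fix: the syntax error seems to come from the last awk call. A single-quoted awk program cannot contain further single quotes, and awk cannot run `awk '...'` as a statement, so the if/else can instead be decided inside a single awk program. Separately, plain bc works with scale=0 by default, so 10^-21 evaluates to 0 there; the sketch evaluates the same polynomial in awk, which uses floating point. The function names compute_cutoff and filter_by_cutoff are hypothetical, and rounding to four decimals is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Sketch only: both function names are hypothetical helpers,
# not part of the original script.

# Evaluate the cutoff polynomial for a given m in awk (floating point,
# scientific notation such as 1.24e-21 is accepted).
compute_cutoff() {
    awk -v m="$1" 'BEGIN {
        c  = -1.24e-21 * m^6
        c +=  3.53e-17 * m^5
        c += -3.90e-13 * m^4
        c +=  2.12e-9  * m^3
        c += -6.06e-6  * m^2
        c +=  0.018    * m
        c +=  3.15
        printf "%.4f\n", c
    }'
}

# Print the rows of a tab-separated file whose third column exceeds
# the threshold: 2 when the cutoff is below 4, otherwise the cutoff.
# $1 = cutoff, $2 = input file
filter_by_cutoff() {
    awk -F'\t' -v c="$1" '
        BEGIN { t = (c + 0 < 4) ? 2 : c + 0 }   # choose the threshold once
        $3 + 0 > t                              # default action: print the row
    ' "$2"
}
```

Inside the loop, these could stand in for the bc line and the final awk call, for example `cutoff=$(compute_cutoff "$m")` followed by `filter_by_cutoff "$cutoff" tmp > "${name}_CUT.txt"`. Note also that `echo "$name\t$cutoff"` prints a literal `\t` in bash; `printf '%s\t%s\n' "$name" "$cutoff"` is the more portable form.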
Any help that you could provide would be much appreciated. Thanks!