Tuesday, 29 August 2017

Another If-else-then statement in bash

I am running into a "weird" "if then else" problem (or I am just a novice). Perhaps I do not fully understand the semantics of the statement.

What I want to do is read through a table (CSV) like the one below:

/home/trotos/I16_1505_09.fastq.gz,34,hg19
/home/trotos/I16_1505_06.fastq.gz,34,hg19
/home/trotos/I16_1505_12.fastq.gz,40,hg19
/home/trotos/I15_1277_01.fastq.gz,42,gg5
/home/trotos/I15_1458_01.fastq.gz,42,gg5
/home/trotos/I15_1314_01.fastq.gz,36,gg5
/home/trotos/I15_1458_03.fastq.gz,36,gg5

and then use the input from each column (sequentially) to perform several commands. The script I am using (not refined yet) is below, where file is the CSV shown above:

#!/bin/bash

shopt -s nullglob

#initialization


file="$3" #the file where you keep your variables
human="$1" #the path of the first genome
mouse="$2" #the path of the second genome

echo "START"    #for test purposes no need for them to be here
echo $human     #for test purposes no need for them to be here
echo $mouse     #for test purposes no need for them to be here
echo $file      #for test purposes no need for them to be here


#How to read columns from files and loop
#visit http://ift.tt/2x0Ux44

while IFS=, read col1 col2 col3 ; do  # the input file is supplied at the end of the loop; see the final "done"
    echo "LOOP"    
    echo $col1   #for test purposes no need for them to be here
    echo $col2   #for test purposes no need for them to be here
    echo $col3   #for test purposes no need for them to be here


#Loop into the file per line 
for i in $col1; do

#do some naming control to use as outputs
    base1=${i##*/}  # Get the file name from the path use this one for the following applications.
    NOEXT1=${base1%.*}  #leave the extension out
    NOEXT2=${NOEXT1%.*} #leave the second extension out
    FOLDER1=${col1%/*}

echo $base1
echo $NOEXT1 
echo $NOEXT2
echo $FOLDER1
echo "...."


#per line, read the third column and decide the genome to be used with an if statement
if [$col3="hg19"]
    then
       ref_genome=$human
echo "1"
echo $ref_genome
    else
        ref_genome=$mouse
echo $ref_genome
echo "2"

echo $ref_genome
fi

echo "....."

echo "END LOOP"


done
done < $file
echo "SCRIPT IS DONE"

The command is the following:

./test '/home/trotos/Downloads/chromosomes/hg19.fa' '/home/trotos/Downloads/chromosomes/gg5.fa.gz' 'csv_file'

What it does correctly is read the columns of the file and get the data I need from each column, line by line, in a loop.
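For reference, the column-reading part can be reduced to a minimal sketch like the one below. The sample paths, the temporary file, and the variable names (`path`, `quality`, `build`, `stem`, `summary`) are made up for illustration and are not part of the script above; the parameter expansions mirror the ones the script already uses.

```bash
#!/bin/bash
# Illustrative only: build a two-row CSV in a temporary file (paths are made up).
csv=$(mktemp)
printf '%s\n' \
    '/data/sample_A.fastq.gz,34,hg19' \
    '/data/sample_B.fastq.gz,42,gg5' > "$csv"

summary=""
# IFS=, splits each line on commas; -r keeps backslashes literal.
while IFS=, read -r path quality build; do
    base=${path##*/}      # drop the directory part:   sample_A.fastq.gz
    noext=${base%.*}      # drop the last extension:   sample_A.fastq
    stem=${noext%.*}      # drop the second extension: sample_A
    dir=${path%/*}        # keep only the directory:   /data
    line="$stem $dir $quality $build"
    echo "$line"
    summary+="$line;"     # collected only so the result can be inspected afterwards
done < "$csv"

rm -f "$csv"
```

Because every row holds a single path in its first column, the sketch works on `path` directly and does not need an inner `for` loop.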

However, the problem appears when I use the third column (hg19 or gg5) as the condition that selects the genome:

if col3 is hg19, then 'hg19.fa' is the correct genome

if col3 is gg5, then 'gg5.fa.gz' is the correct genome, but the script's output differs:

LOOP
/home/trotos/lane2_I15_1458_08.fastq.gz
42
**gg5**
/home/trotos/lane2_I15_1458_08.fastq.gz
/home/trotos/lane2_I15_1458_08.fastq
/home/trotos/lane2_I15_1458_08

....
1
**/home/trotos/Downloads/chromosomes/hg19.fa**
.....
END LOOP

The first problem is that when col3==hg19 it gives the correct output, which would be "/home/trotos/Downloads/chromosomes_hg19/hg19.fa". But when col3==gg5 I still get the same "/home/trotos/Downloads/chromosomes_hg19/hg19.fa". So how can I get the correct answer? The second is how to use the "if then else" statement to get the specific file that corresponds to the third column, and to use that information inside the loop defined by:

 for i in $col1; do.
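One likely cause is the spelling of the test itself. In bash, `[` is a command, so it needs a space after it and before the closing `]`, and `=` has to be a separate argument with spaces on both sides. Written as `[ $col3="hg19" ]`, the test receives one non-empty word such as `gg5=hg19`, which is always true. Written with no spaces at all, no string comparison takes place: the whole word is a single token, and with `nullglob` set it may even be expanded away as an unmatched glob pattern, leaving an empty command that succeeds. Either reading would be consistent with the `then` branch running for every row. A minimal sketch of the corrected comparison follows; the function name `pick_genome` and the placeholder paths are hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: choose the reference genome for one CSV row.
#   $1 = value of the third column (e.g. hg19 or gg5)
#   $2 = path of the first (human) genome
#   $3 = path of the second genome
pick_genome() {
    local build="$1" human="$2" mouse="$3"
    # Note the spaces after "[", around "=", and before "]".
    # Quoting "$build" keeps the test valid even when the column is empty.
    if [ "$build" = "hg19" ]; then
        printf '%s\n' "$human"
    else
        printf '%s\n' "$mouse"
    fi
}

pick_genome "hg19" "/path/to/hg19.fa" "/path/to/gg5.fa.gz"   # prints /path/to/hg19.fa
pick_genome "gg5"  "/path/to/hg19.fa" "/path/to/gg5.fa.gz"   # prints /path/to/gg5.fa.gz
```

Inside the script, the same comparison can be written inline as `if [ "$col3" = "hg19" ]; then`. A variable assigned in either branch, such as `ref_genome`, stays set after `fi`, so it can be used by any later command in the same loop iteration.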

Thank you in advance. I hope my description will not confuse you.
