Friday, 22 July 2016

Nested if statements: Swapping headers and sequences in FASTA files

I am opening a directory and processing each file. Two sample files look like:

Sample 1)

    >UVWXY
    ABCDEFGHIJKLMNOPQRSTUVWXYZ
    >STUVW
    ABCDEFGHIJKLMNOPQRSTUVWXYZ
    >QRSTU
    ABCDEFGHIJKLMNOPQRSTUVWXYZ

Sample 2)

    >CDEFG
    ABCDEFGHIJKLMNOPQRSTUVWXYZ

I am trying to turn these files into ones that look like:

Sample 1)

    >TUVWXYZ
    UVWXY
    >RSTUVWX
    STUVW
    >PQRSTUV
    QRSTU

Sample 2)

    >BCDEFGH
    CDEFG

In other words, the header line of each FASTA record needs to be swapped with the part of the sequence that matches it, extended by one flanking letter on each side. I want to write each input file's results to its own separate output file. Here is my code so far. It runs without errors but doesn't generate any output. My guess is that the problem is related to the nested if statements, which I have never worked with before.

    #!/usr/bin/perl 
    use strict; 
    use warnings; 

    my ($directory) = @ARGV;
    my $dir = "$directory";
    my @ArrayofFiles = glob "$dir/*";
    my $count = 0; 

    open(OUT, ">", "/path/to/output_$count.txt") or die $!; 

    foreach my $file(@ArrayofFiles){
         open(my $fastas, $file) or die $!;
         while (my $line = <$fastas>){
              $count++;
              if ($line =~ m/(^>)([a-z]{5})/i){
                    my $header = $2;
              if ($line !~ /^>/){
                    my $sequence .= $line;
                    if ($sequence =~ m/(([a-z]{1})($header)([a-z]{1}))/i){
                            my $matchplusflanks = $1;
                            print OUT ">", $matchplusflanks, "\n", $header, "\n";
                    }
              }

              }
          }
      }

How can I fix this code? Thanks.
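
One possible fix, offered as a sketch rather than a definitive answer. In the code above, the test `$line !~ /^>/` sits inside the block that only runs when the line does start with `>`, so it can never succeed, and `my $header` goes out of scope before any sequence line is read. The version below makes the two tests sibling branches (`if`/`elsif`) and declares `$header` outside them. It assumes each sequence sits on a single line directly after its header, as in the samples. The helper names `match_with_flanks` and `swap_fasta`, the optional second command-line argument for an output directory, and the `output_N.txt` file names are illustrative choices, not part of the original code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the header plus one flanking letter on each
# side as found in the sequence, or undef if there is no such match.
sub match_with_flanks {
    my ($header, $sequence) = @_;
    # \Q...\E makes the header match literally inside the pattern.
    return $sequence =~ /([a-z]\Q$header\E[a-z])/i ? $1 : undef;
}

# Hypothetical helper: read FASTA records from $in and print the swapped
# records to $out. Assumes one sequence line directly after each header.
sub swap_fasta {
    my ($in, $out) = @_;
    my $header;    # declared outside the branches so it survives between lines
    while (my $line = <$in>) {
        chomp $line;
        if ($line =~ /^>([a-z]{5})/i) {
            $header = $1;                  # header line: remember it
        }
        elsif (defined $header) {          # sequence line: sibling branch, not nested
            my $flanked = match_with_flanks($header, $line);
            print {$out} ">$flanked\n$header\n" if defined $flanked;
            undef $header;                 # wait for the next header
        }
    }
}

# Process a directory only when one is given on the command line.
if (@ARGV) {
    my ($dir, $out_dir) = @ARGV;
    $out_dir = '.' unless defined $out_dir;   # assumption: default to the current directory
    my $count = 0;
    for my $file (grep { -f $_ } glob "$dir/*") {
        $count++;
        my $out_name = "$out_dir/output_$count.txt";   # one output file per input file
        open my $in,  '<', $file     or die "Cannot read $file: $!";
        open my $out, '>', $out_name or die "Cannot write $out_name: $!";
        swap_fasta($in, $out);
        close $in;
        close $out or die "Cannot close $out_name: $!";
    }
}
```

Opening the output file inside the `foreach` loop, after `$count` has been incremented, is what gives each input file its own output file. For Sample 1 this writes `>TUVWXYZ`, `UVWXY`, `>RSTUVWX`, `STUVW`, `>PQRSTUV`, `QRSTU`, one per line. It is safest to point the output directory somewhere other than the input directory, so that a second run does not pick up the earlier output files as input.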
