mardi 5 avril 2016

perl start reading file from given string

I need to open a very messy csv file (by messy I mean blank columns and rows in between data and I only need data from some of the columns) and only start allocating data to columns once it reaches a given line with the word 'Information A' or 'Informasie A' (the files are in one of two different languages but hold the same format). The table has a format more or less as follows:

(n) Name
(n) General info
(n) ID
(n) Contact
(n) General
(n)
(a) Information A
(a)
(a) Name
(a) one
(a) two
(a) three
(a)
(a) four
(a) five
(a) Total
(b) Information B
(b)
(b) Name

The basic outline of the program was written for me which initially worked where the first section of extra details was designated by $part='n' (shown as (n) above), while that which comes after 'Information A' was designated 'a' and so forth. However, I think I may have deleted some of the code which made the whole extraction unusable. I tried fixing it but that has seen to do more damage than good so I'm trying to do it over from scratch and hopefully learn an easier way to do it in the process.

The code that I have so far is as follows:

open (IN_F, "$file") or die "Can't open $file";

  my %file;

while (<IN_F>){

  my $line = $_;
     $line =~ s/\s*$//g;
     $line =~ s/\-//g;

  my $part='n';
     $part='a' if (substr($line,0,13) eq 'Information A');
     $part='b' if (substr($line,0,13) eq 'Information B');

  next if $part='a';
  last if substr($line,0,20) eq 'Litter Information B';

  print "$line\n";
}
exit;

where I want the print to then be:

Name
one
two
three
four
five
Total

I found similar questions that had different solutions; some of them used line number but mine aren't constant. A different solution used '..', which I tried but I think I didnt apply it correctly.

Any help will be greatly appreciated!

Aucun commentaire:

Enregistrer un commentaire