I'm trying to write a Perl script to do the following:
I have a file called "filelist.txt" that contains a list of files (e.g. file1.txt, file2.txt, etc.), along with a column of numeric values that need to be read and used to update the corresponding rows in each of those files.
Note: All files are tab delimited.
filelist.txt (input)
file1.txt 1.word, 3.word, 5.word
file2.txt 2.word
file3.txt 4.word, 5.word
file4.txt 3.word, 4.word, 5.word
file5.txt 4.word
Each file has 'x' number of rows:
file1.txt (input)
1 word1 word2 word3
2 word1 word2 word3
3 word1 word2 word3
4 word1 word2 word3
5 word1 word2 word3
What I need to do is read column 1 in "filelist.txt" to get the filename (e.g. file1.txt), then read column 2 to get the row numbers (e.g. 1.word, 3.word, 5.word). Once I have the row numbers, I need to open "file1.txt" and print every row exactly once to the output file: rows whose number matches one of the numbers extracted from filelist.txt are printed marked up (as shown below), and all other rows are printed unchanged. In this example, "1" from "1.word" matches row 1, "3" from "3.word" matches row 3, and "5" from "5.word" matches row 5. This needs to be done for every file in "filelist.txt".
file1.tmp (output)
<strike>1</strike> <strike>word1</strike> <strike>word2</strike> <strike>word3</strike>
2 word1 word2 word3
<strike>3</strike> <strike>word1</strike> <strike>word2</strike> <strike>word3</strike>
4 word1 word2 word3
<strike>5</strike> <strike>word1</strike> <strike>word2</strike> <strike>word3</strike>
OUTPUT REQUIRED (based on files in filelist.txt).
file1.txt needs rows 1, 3, 5 to be updated. 2, 4 stay as is.
file2.txt needs row 2 to be updated. 1, 3, 4, 5 stay as is.
file3.txt needs rows 4, 5 to be updated. 1, 2, 3 stay as is.
file4.txt needs rows 3, 4, 5 to be updated. 1, 2 stay as is.
file5.txt needs row 4 to be updated. 1, 2, 3, 5 stay as is.
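The mapping listed above can be sketched in Perl as a lookup from row number to a true value. This is only an illustration, assuming the second column always has the form "N.word, N.word, ..."; the helper name parse_filelist_line is hypothetical and not part of the script below.

```perl
use strict;
use warnings;

# Hypothetical helper: split one tab-delimited line of filelist.txt into
# the filename and a hash whose keys are the row numbers to update.
sub parse_filelist_line {
    my ($line) = @_;
    chomp $line;
    my ($filename, $rowspec) = split /\t/, $line, 2;
    my %rows;
    for my $entry (split /\s*,\s*/, $rowspec // '') {
        # Keep only the digits before the dot, e.g. "3" from "3.word".
        $rows{$1} = 1 if $entry =~ /^(\d+)\./;
    }
    return ($filename, \%rows);
}

my ($file, $rows) = parse_filelist_line("file1.txt\t1.word, 3.word, 5.word\n");
print "$file: ", join(", ", sort { $a <=> $b } keys %$rows), "\n";
# prints "file1.txt: 1, 3, 5"
```

With the row numbers held in a hash, checking whether a given row needs updating is a single lookup such as $rows->{3}, rather than a loop over the list.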
SCRIPT
use Cwd;
$dir = getcwd;
$nofile = "FILE NOT FOUND";
$strike = "<>";
$tab = "\t";
my $filelist = "filelist.list";
open INFILE, "$dir/$filelist" or die "$nofile\n";
while (my $line=<INFILE>) {
    chomp($line);
    my ($filename, $rownum) = split /\t/, $line;
    @rowarray = split(/, /, $rownum);
    my $arraysize = @rowarray;
    open INFILE2, "$dir/$filename" or die "$nofile\n";
    $filename =~ s/.txt//;
    $tmpfilename = $filename;
    open (OUTFILE, ">$dir/$tmpfilename.tmp");
    while (my $line2=<INFILE2>) {
        chomp ($line2);
        my ( $fn, $col1, $col2, $col3 ) = split (/\t/, $line2);
        for ($i = 0; $i < $arraysize; $i++) {
            $scratched = $rowarray[$i];
            my ($substring2) = $scratched =~ /(.*)?\./;
            if ($substring2 == $fn) {
                print "Match: $substring2 == $fn\n\n";
                print OUTFILE "$strike$fn$strike$tab$strike$col1$strike$tab$strike$col2$strike$tab$strike$col3$strike\n";
            }
            elsif ($substring2 != $fn) {
                print "No match: $substring2 != $fn\n\n";
                print OUTFILE "$fn$tab$col1$tab$col2$tab$col3\n";
            }
        }
    }
}
close (INFILE);
close (INFILE2);
close (OUTFILE);
CURRENT OUTPUT
<>1<> <>dogs<> <>word2<> <>word3<>
1 dogs word2 word3
1 dogs word2 word3
2 word1 word2 word3
2 word1 word2 word3
2 word1 word2 word3
3 cats word2 word3
<>3<> <>cats<> <>word2<> <>word3<>
3 cats word2 word3
4 word1 word2 word3
4 word1 word2 word3
4 word1 word2 word3
5 frog word2 word3
5 frog word2 word3
<>5<> <>frog<> <>word2<> <>word3<>
Been working on this for a few days and unfortunately, I cannot see how to get this to work properly.
Any suggestions/help would be greatly appreciated.
Thank you in advance.
Billy J.
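A possible explanation for the output above: both print OUTFILE statements sit inside the for loop over @rowarray, so every row of the data file is written once per entry in the list (three times for file1.txt) and is marked only on the iteration where the numbers match. One way around this, sketched below, is to collect the row numbers in a hash first and then write each row exactly once. This is a minimal sketch, not a definitive fix: the sub names mark_rows and process_filelist are hypothetical, the "<>" marker is taken from the posted script, and it assumes data filenames end in ".txt" and that the row number is the first tab-delimited field.

```perl
use strict;
use warnings;

my $strike = "<>";   # marker taken from the posted script

# Copy $in_path to $out_path, wrapping every field of the rows listed
# in %$rows with the marker. Each input row is written exactly once.
sub mark_rows {
    my ($in_path, $out_path, $rows) = @_;
    open my $in,  '<', $in_path  or die "Cannot read $in_path: $!\n";
    open my $out, '>', $out_path or die "Cannot write $out_path: $!\n";
    while (my $line = <$in>) {
        chomp $line;
        my @fields = split /\t/, $line;
        if (@fields && $rows->{ $fields[0] }) {
            @fields = map { "$strike$_$strike" } @fields;
        }
        print {$out} join("\t", @fields), "\n";
    }
    close $in;
    close $out or die "Cannot close $out_path: $!\n";
}

# Read the list file and process every data file it names.
sub process_filelist {
    my ($list_path) = @_;
    open my $list, '<', $list_path or die "Cannot read $list_path: $!\n";
    while (my $line = <$list>) {
        chomp $line;
        next unless length $line;
        my ($filename, $rowspec) = split /\t/, $line, 2;
        my %rows;
        for my $entry (split /,/, $rowspec // '') {
            # Row numbers are the digits before the dot, e.g. "3" from "3.word".
            $rows{$1} = 1 if $entry =~ /^\s*(\d+)\./;
        }
        (my $tmpname = $filename) =~ s/\.txt$/.tmp/;
        die "Refusing to overwrite $filename\n" if $tmpname eq $filename;
        mark_rows($filename, $tmpname, \%rows);
    }
    close $list;
}

# Only runs when a list file is given on the command line.
process_filelist($ARGV[0]) if @ARGV;
```

Saved under a name of your choosing, it would be run from the directory that holds the data files, with the list file as its only argument. The sketch uses lexical filehandles with three-argument open and closes each data file inside the loop, so every .tmp file is flushed before the next one is opened. It also escapes the dot in s/\.txt$/.tmp/, since the unescaped s/.txt// in the posted script matches any character followed by "txt".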