Friday, 20 February 2015

AWK - import IF conditions from a file

I'm trying to use awk to parse some files and extract only the records that match a set of regular expressions. To do that, I'm trying to pass the set of conditions to an if statement like this:



$ awk 'BEGIN{FS="|"; IGNORECASE=1} NR==FNR{a[$0];next} {for (i in a){ if(i) {print $0}}}' file1.txt file2.txt


Here file1.txt holds the list of regexes that I want to search for in file2.txt. For convenience, suppose the two files look like this:



$ cat file1.txt

$4 ~ $2 "foo[^.]*" $3
$4 ~ $3 "[^.]*foo" $2

$ cat file2.txt

1|this|bar|In this line, bar is before foo|
2|not|here|Here, foo is before. Not|
3|First|Second|First comes foo then bar comes second.|


So, in this particular example, my regular expressions try to match the words from fields $2 and $3 with the string foo in between, all within the same sentence (that's why I'm using [^.]* when matching against field $4). Since I don't care whether $2 comes before $3 or vice versa (as long as they are in the same sentence with foo in between), I have two regexes that cover both orders, and only the third record should be printed.
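To make the matching concrete, here is a small sketch of what the string concatenation in such a condition actually builds for the third record. It suggests that the first pattern as written would require foo to follow $2 immediately, so a relaxed variant with [^.]* on both sides of foo (an assumption, not something stated in file1.txt) is shown next to it. The variable names are illustrative, and tolower() is used instead of the gawk-only IGNORECASE so the snippet runs in any POSIX awk.

```shell
# Feed the third record from file2.txt to awk and compare two pattern shapes.
out=$(printf '%s\n' '3|First|Second|First comes foo then bar comes second.|' |
awk -F'|' '{
    as_written = $2 "foo[^.]*" $3        # pattern shape taken from file1.txt
    relaxed    = $2 "[^.]*foo[^.]*" $3   # assumed variant: words allowed before foo
    # Print each generated regex followed by 1 (match) or 0 (no match).
    print as_written, (tolower($4) ~ tolower(as_written))
    print relaxed,    (tolower($4) ~ tolower(relaxed))
}')
printf '%s\n' "$out"
```

The first line of output ends in 0 and the second in 1, which is why the relaxed shape is the one that selects the third record.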


Since I need to find many patterns in field $4 across many files, my first approach was to keep the conditions in a list, but perhaps there are other ways around this.
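One possible way around this, sketched under stated assumptions: awk does not evaluate a string as code, so a test such as if(i) only checks whether the stored line is a non-empty string, which is always true. Each condition line can instead be turned into an awk rule, and the generated program run with -f. The file names conds.txt and rules.awk are hypothetical, the conditions use [^.]*foo[^.]* (an assumed relaxation of the patterns in file1.txt), and tolower() stands in for the gawk-only IGNORECASE.

```shell
# Scratch directory holding the sample data from the question.
tmpdir=$(mktemp -d)

# One awk condition per line (hypothetical file name).
cat > "$tmpdir/conds.txt" <<'EOF'
tolower($4) ~ tolower($2 "[^.]*foo[^.]*" $3)
tolower($4) ~ tolower($3 "[^.]*foo[^.]*" $2)
EOF

cat > "$tmpdir/file2.txt" <<'EOF'
1|this|bar|In this line, bar is before foo|
2|not|here|Here, foo is before. Not|
3|First|Second|First comes foo then bar comes second.|
EOF

# Turn every condition into a rule of the form: condition { print; next }
# "next" keeps a record from being printed twice when several rules match.
sed 's/$/ { print; next }/' "$tmpdir/conds.txt" > "$tmpdir/rules.awk"

# Run the generated program against the data file.
result=$(awk -F'|' -f "$tmpdir/rules.awk" "$tmpdir/file2.txt")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

With the sample data this prints only the third record. Two caveats apply to the sketch: field values containing regex metacharacters would be interpreted as part of the pattern, and a blank line in the conditions file would become an unconditional rule that prints every record.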


I'd appreciate any help and comments.

