I have a file with many lines (reads from a genome), sorted by location. I want to loop over these lines and, if several lines share the same ID (column 4), keep only the first one if column 3 is a plus, or only the last one if column 3 is a minus. This is my code, but it seems that my variable (lastID) is not properly updated after each line. Tips are much appreciated.
awk 'BEGIN {lastline=""; lastID=""}
{if ($lastline != "" && $4 != $lastID)
{print $lastline; lastline=""};
if ($3 == "+" && $4 != $lastID)
{print $0; lastline=""}
else if ($3 == "+" && $4 == $lastID)
{lastli=""}
else if ($3 == "-")
{lastline=$0};
lastID=$4
}' file
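One likely cause: in awk, variables are written without a `$`. `$lastID` means "the field whose number is the value of lastID", so with lastID empty it evaluates to `$0`, the whole line, and the comparisons never test the stored ID. The same applies to `$lastline`, and `lastli` looks like a typo for `lastline`. The script also has no END block, so a trailing minus line would never be printed. Below is a minimal sketch of one possible fix, not necessarily the only way to do it. The sample data, the temporary file and the variable names `pending`, `sample` and `result` are assumptions made up for illustration, and it assumes whitespace-separated columns with strand in column 3 and a non-empty ID in column 4.

```shell
# Hypothetical sample input: chromosome, position, strand, read ID.
sample=$(mktemp)
printf '%s\n' \
  'chr1 100 + readA' \
  'chr1 105 + readA' \
  'chr1 200 - readB' \
  'chr1 210 - readB' \
  'chr1 300 + readC' \
  'chr1 400 - readD' > "$sample"

result=$(awk '
$4 != lastID {                         # a new ID group starts on this line
    if (pending != "") print pending   # flush the last "-" line of the previous group
    pending = ""
    if ($3 == "+") print               # keep the first "+" line of the group
}
$3 == "-" { pending = $0 }             # remember the most recent "-" line
{ lastID = $4 }                        # plain variable, no "$" in front
END { if (pending != "") print pending }   # flush a trailing "-" group
' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

On this sample it keeps the line at position 100 for readA, 210 for readB, 300 for readC and 400 for readD. Note that a minus line is only printed once the next ID group begins (or at the end of the file), because only then is it known to be the last one.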