Sunday, 10 November 2019

Python: if any(s in line for s in list): print line ... not working

I'm trying to extract only the lines in a file that contain a string from a list. For example, if MSTRG.2 is in my list, I want the line containing it written to my outfile. I've used the code below, but for some reason the extracted lines don't necessarily contain a string from the list.

id_list = []

for line in gff_compare:
    split_line = line.strip().split('\t')
    class_code = split_line[2]

    if class_code == 'u':
        if split_line[3] not in id_list:
            id_list.append(split_line[3])


for line in feature_counts:
    split_line_2 = line.strip().split('\t')
    string_ids = split_line_2[0]

    if any(s in string_ids for s in id_list):
        outfile.write(line)

outfile.close()

id_list contains only 1511 elements, whereas outfile has over 30,000 lines (some contain a string from the list and some don't). I can't work out why it isn't pulling out only the lines I want based on the strings in the list.

Any help appreciated! Thanks!
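
One possible explanation, sketched under assumptions: for strings, `s in string_ids` is a substring test, so an ID such as MSTRG.2 would also match MSTRG.20, MSTRG.21, MSTRG.200 and so on. If the first column of the feature counts file holds exactly one ID per line (an assumption, since the files aren't shown), comparing whole IDs against a set might give the expected result. The sample data and the function names `substring_filter` and `exact_filter` below are made up for illustration.

```python
# Hypothetical sample data; the real files are not shown in the post.
id_list = ["MSTRG.2", "MSTRG.7"]
feature_lines = [
    "MSTRG.2\t10\n",
    "MSTRG.20\t5\n",
    "MSTRG.7\t3\n",
    "MSTRG.71\t8\n",
]


def substring_filter(lines, ids):
    """Mimics the original test: keeps a line if any ID is a substring of column 0."""
    kept = []
    for line in lines:
        first_column = line.strip().split("\t")[0]
        if any(s in first_column for s in ids):
            kept.append(line)
    return kept


def exact_filter(lines, ids):
    """Keeps a line only if column 0 is exactly equal to one of the IDs."""
    id_set = set(ids)  # a set also makes each lookup fast
    kept = []
    for line in lines:
        first_column = line.strip().split("\t")[0]
        if first_column in id_set:
            kept.append(line)
    return kept


print(len(substring_filter(feature_lines, id_list)))  # 4: MSTRG.20 and MSTRG.71 slip through
print(len(exact_filter(feature_lines, id_list)))      # 2: only MSTRG.2 and MSTRG.7
```

If the first column can hold several IDs joined by a separator, it would need to be split on that separator first and each piece checked against the set.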
