Wednesday, 24 November 2021

Check if values in a list occur in a file

I have a file that looks like the sample below. Note that in reality the file contains more than 100,000 records.

blue    black    red      250
red     black    blue     140
black   yellow   purple   100
orange  blue     blue     140
blue    black    red      250
red     black    blue     140
black   yellow   purple   700
orange  blue     blue     200

I also have a list that contains the following values: my_list = ['140', '700', '800']

Now I want the following:

  1. If a value from my_list occurs in the fourth column of the file (row[3]), I want to append the whole record to a new list.
  2. If a value from my_list does not occur in row[3] anywhere in the file, I want to append the value itself, with the remaining fields set to 'unknown'.

This is my code:

import csv

new_list = []
with open(my_file, 'r') as input:
    reader = csv.reader(input, delimiter='\t')
    row3_list = []
    for row in reader:
        row3_list.append(row[3])
        for my_number in my_list:
            if my_number in row3_list:
                new_list.append(row)
            elif my_number not in row3_list:
                new_list.append(['unknown', 'unknown', 'unknown', row[3]])

This is my desired output:

red     black    blue     140
orange  blue     blue     140
red     black    blue     140
black   yellow   purple   700
unknown unknown  unknown  800

My problem: as I mentioned, my file contains a large number of records, possibly more than 100,000, so the approach above takes ages. I have been waiting for output for about 15 minutes and still have nothing.
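
One possible direction, offered as a sketch rather than a definitive fix: the slowdown most likely comes from testing membership in a list that grows with every row, inside a nested loop. Reading the file once and checking row[3] against a set should avoid that. The function names match_rows and match_file are hypothetical, and the code assumes the file really is tab-delimited and that every wanted value should appear at most once as an 'unknown' placeholder.

```python
import csv


def match_rows(rows, wanted):
    """Return rows whose 4th field is in `wanted`, in file order,
    followed by a placeholder row for each wanted value never seen."""
    wanted_set = set(wanted)   # set lookups are constant time on average
    seen = set()               # wanted values that matched at least one row
    result = []
    for row in rows:
        # Skip short or empty rows instead of raising IndexError.
        if len(row) > 3 and row[3] in wanted_set:
            result.append(row)
            seen.add(row[3])
    # Placeholders are added only after the whole input has been read,
    # because only then is it known which values never occurred.
    for value in wanted:
        if value not in seen:
            result.append(['unknown', 'unknown', 'unknown', value])
    return result


def match_file(path, wanted, delimiter='\t'):
    """Hypothetical wrapper: apply match_rows to a delimited text file."""
    with open(path, newline='') as handle:
        return match_rows(csv.reader(handle, delimiter=delimiter), wanted)
```

With this shape, new_list = match_file(my_file, my_list) makes a single pass over the file, so the running time should grow roughly linearly with the number of records.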
