Monday, December 7, 2020

How to process multiple files based on the date in their names

Let's assume I have a structure like this:

Folder1
       `XX_20201212.txt`
Folder2
       `XX_20201212.txt`
Folder3
       `XX_20201212.txt`

My current script collects the 3 files (one from each folder), processes them and combines them into 1 file. So right now my script does the job for 1 date.

Now let's assume the structure has changed to this:

Folder1
       `XX_20201201.txt`
       `XX_20201202.txt`
Folder2
       `YY_20201201.txt`
       `YY_20201202.txt`
Folder3
       `ZZ_20201201.txt`
       `ZZ_20201202.txt`
       `ZZ_20201203.txt`

I want my script to do the same thing, but for multiple dates. For each file, it should check whether the date in the file name is also present in a list named missing_dates, and whether a file with that date is available in every directory. If both are true, I want to collect those files and process them into 1 file. So if we assume 20201201, 20201202 and 20201203 are in missing_dates, the following needs to happen:

  1. The script will process XX_20201201.txt, YY_20201201.txt and ZZ_20201201.txt into 1 file, because that date is present in missing_dates AND it is present in every directory.
  2. The script will process XX_20201202.txt, YY_20201202.txt and ZZ_20201202.txt into 1 file, because that date is present in missing_dates AND it is present in every directory.
  3. The script will NOT process ZZ_20201203.txt, because that date is not present in every directory, even though it is present in missing_dates.

In short: 3 files with the same date (in 3 different directories), where that date is present in missing_dates = process.
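
One way to read this rule is as a set intersection: take the dates found in each directory, intersect them, and keep only those that are also in missing_dates. The snippet below is only a sketch of that idea, not the script from this post. `files_by_folder`, `dates_in` and `dates_to_process` are hypothetical names, the folder names are placeholders, and it assumes every file name carries an 8-digit date between an underscore and the extension.

```python
import re

# Hypothetical listing: folder name -> file names found in it.
files_by_folder = {
    "Folder1": ["XX_20201201.txt", "XX_20201202.txt"],
    "Folder2": ["YY_20201201.txt", "YY_20201202.txt"],
    "Folder3": ["ZZ_20201201.txt", "ZZ_20201202.txt", "ZZ_20201203.txt"],
}
missing_dates = ["20201201", "20201202", "20201203"]

# Assumption: the date is 8 digits between "_" and the extension.
DATE_RE = re.compile(r"_(\d{8})\.")


def dates_in(names):
    """Return the set of dates found in a list of file names."""
    return {m.group(1) for m in map(DATE_RE.search, names) if m}


def dates_to_process(files_by_folder, missing_dates):
    """Return the dates that are in missing_dates and appear in every folder."""
    per_folder = [dates_in(names) for names in files_by_folder.values()]
    if not per_folder:
        return []
    common = set.intersection(*per_folder)
    return sorted(common & set(missing_dates))


print(dates_to_process(files_by_folder, missing_dates))
# → ['20201201', '20201202']
```

20201203 drops out because only one folder contains it, which matches case 3 above.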

Please note that the code below, which processes the files into 1 file, is already working. The underlying problem is that I have to adjust my loop so that it handles more than 1 date, and I don't know how to do that.

This is the code that reads the files:

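import csv
import os
import re

# vehicle_loc_dict (presumably a defaultdict(list)), filter_row and
# location_token are defined elsewhere in the script.
for root, dirs, files in os.walk(counter_part):
    for file in files:
        # Digits between "_" and the extension, e.g. "20201201"
        date_files = re.search(r'_(\d+)\.', file).group(1)
        file_path = os.path.join(root, file)  # full path of the current file
        with open(file_path, 'r', newline='') as my_file:
            reader = csv.reader(my_file, delimiter=',')
            next(reader)  # skip the header row
            for row in reader:
                if filter_row(row):
                    vehicle_loc_dict[(row[9], location_token(row))].append(row)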
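
A possible way to extend this loop to several dates is to walk the tree once, group the file paths by date, and then run the existing per-file code once per qualifying date. The following is a minimal sketch under assumptions, not a definitive implementation: `group_paths_by_date`, `process_dates` and the `process_one_date` callback are hypothetical names, the date is assumed to be 8 digits, and "every directory" is taken to mean every directory that contains at least one dated file.

```python
import os
import re
from collections import defaultdict

# Assumption: names look like XX_20201201.txt (8-digit date before the extension).
DATE_RE = re.compile(r"_(\d{8})\.")


def group_paths_by_date(top_dir):
    """Map each date to {folder: file path} for all dated files under top_dir."""
    paths_by_date = defaultdict(dict)
    for root, _dirs, files in os.walk(top_dir):
        for name in files:
            match = DATE_RE.search(name)
            if match:
                paths_by_date[match.group(1)][root] = os.path.join(root, name)
    return paths_by_date


def process_dates(top_dir, missing_dates, process_one_date):
    """Call process_one_date(date, paths) for each date in missing_dates
    that has a file in every folder containing dated files."""
    paths_by_date = group_paths_by_date(top_dir)
    all_folders = {f for by_folder in paths_by_date.values() for f in by_folder}
    wanted = set(missing_dates)
    for date in sorted(paths_by_date):
        by_folder = paths_by_date[date]
        if date in wanted and set(by_folder) == all_folders:
            # paths holds one file per folder, all sharing the same date
            process_one_date(date, sorted(by_folder.values()))
```

Here `process_one_date` would wrap the existing reading code: create a fresh `vehicle_loc_dict` for the date, open each path in `paths` with the `csv.reader` loop above, then write the single output file for that date.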
    
