Wednesday, 25 March 2020

How can I speed up the process of a large for loop and nested if statements? [closed]

I have the code shown below and want to speed it up, as it is very slow. This is only one section of it; the full script has 40-50 nested if statements and performs badly. How could I restructure it so that it loops through as many PDF files as necessary and runs the code below on each one?

import os
import re

import pandas as pd
import tabula

ListRefCount = -1
ChosenID = []
ChosenName = []

# Collate PDF Files (FileLocation is assumed to be the folder path, ending with a path separator)
for x in os.listdir(FileLocation):
    if x.endswith(".pdf"):
        # Initiate Client Count
        ListRefCount += 1
        # Read PDF File
        df = pd.DataFrame(tabula.read_pdf(FileLocation + x, pages="all", multiple_tables=False))
        # Keep the table as a single string so that re.search can scan it
        table = df.to_string(header=False, index=False, index_names=False)
        IDNumber = re.search(r"ID:(.*?),", table).group(1).strip()
        if len(IDNumber) == 0:
            # [1] Assign ID
            ChosenID.append(ListRefCount)
        elif len(IDNumber) != 6:
            # [1] Report Error Message (pdf is the report object defined elsewhere in the script)
            pdf.cell(200, 10, txt="Patient " + str(ListRefCount) + " Has Been Assigned An Invalid ID Number "
                                                                   "Either Too Large Or Short",
                     ln=100, align="L")
        else:
            ChosenID.append(int(IDNumber))
        # Next Request
        Name = re.search(r"Name:(.*?),", table).group(1).strip()
        ChosenName.append(Name)

        .....
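
One possible restructuring, offered only as a rough sketch: describe each request as an entry in a table (label, pattern, optional check) and run a single loop over that table for every file, instead of writing a separate if block per field. The names `FIELDS`, `check_id` and `extract_fields` are hypothetical, and the sample strings are made-up stand-ins for the text that would come out of each PDF table.

```python
import re


def check_id(value):
    """Return an error message for an invalid ID, or None when it is acceptable."""
    # An empty ID is allowed here because the caller can assign a fallback.
    if value and len(value) != 6:
        return "invalid ID number, either too long or too short"
    return None


# Hypothetical field table: (label, compiled pattern, optional validator).
# Patterns are compiled once instead of once per file.
FIELDS = [
    ("ID", re.compile(r"ID:(.*?),"), check_id),
    ("Name", re.compile(r"Name:(.*?),"), None),
]


def extract_fields(text, fields=FIELDS):
    """Pull every configured field out of one document's text.

    Returns (values, errors): values maps label -> stripped string
    (empty when the label is missing), errors is a list of messages.
    """
    values, errors = {}, []
    for label, pattern, validate in fields:
        match = pattern.search(text)
        value = match.group(1).strip() if match else ""
        values[label] = value
        message = validate(value) if validate else None
        if message:
            errors.append(f"{label}: {message}")
    return values, errors


# Made-up strings standing in for the text of each PDF table.
documents = ["ID: 123456, Name: Ann Lee,", "ID: 12, Name: Bob Ray,"]
results = [extract_fields(text) for text in documents]
```

Adding another request then means adding one row to `FIELDS` rather than another nested branch, and the error messages can be written to the report in one place after the loop.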
