I have the code below and want to make it faster, as it is currently very slow. This is only one section of the script; the full version repeats this pattern with 40-50 nested if statements, and it performs badly. How can I restructure it to loop through as many PDF files as necessary and run the checks below on each one?
import os
import re

import pandas as pd
import tabula

ListRefCount = -1
ChosenID = []
ChosenName = []
# Collate PDF Files (FileLocation is the folder path, defined elsewhere)
for x in os.listdir(FileLocation):
    if x.endswith(".pdf"):
        # Initiate Client Count
        ListRefCount += 1
        # Read PDF File
        df = pd.DataFrame(tabula.read_pdf(os.path.join(FileLocation, x), pages="all", multiple_tables=False))
        # Keep the table as one string: re.search needs a string, not a list of lines
        table = df.to_string(header=False, index=False, index_names=False)
        IDNumber = re.search(r"ID:(.*?),", table).group(1).strip()
        if len(IDNumber) == 0:
            # [1] Assign ID
            ChosenID.append(ListRefCount)
        elif len(IDNumber) != 6:
            # [1] Report Error Message (pdf is the report object, defined elsewhere)
            pdf.cell(200, 10, txt="Patient " + str(ListRefCount) + " Has Been Assigned An Invalid ID Number, "
                                  "Either Too Long Or Too Short",
                     ln=100, align="L")
        else:
            ChosenID.append(int(IDNumber))
        # Next Request
        Name = re.search(r"Name:(.*?),", table).group(1).strip()
        ChosenName.append(Name)
.....
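One possible way to avoid dozens of hand-written if blocks is to describe each field once in a table and let a single loop do the searching and checking. The sketch below is only an illustration under assumptions: the names FIELDS, check_id and extract_fields are hypothetical, only the two fields shown in the question are included, and it expects each PDF to have already been converted to one string as in the snippet above. The patterns are compiled once rather than on every file. It may also be worth profiling first, since the PDF parsing itself is likely to account for much of the run time.

```python
import re


def check_id(value):
    """Hypothetical validator: an empty ID is allowed, otherwise it must be 6 characters."""
    if value and len(value) != 6:
        return "invalid ID number: expected 6 characters"
    return None


# Hypothetical field table: (field name, pre-compiled pattern, validator or None).
# Adding another request means adding one row here instead of another if block.
FIELDS = [
    ("ID", re.compile(r"ID:(.*?),"), check_id),
    ("Name", re.compile(r"Name:(.*?),"), None),
]


def extract_fields(text, fallback_id):
    """Return (values, errors) for the text of one document."""
    values, errors = {}, []
    for name, pattern, validate in FIELDS:
        match = pattern.search(text)
        # A missing field is treated as an empty string instead of raising.
        value = match.group(1).strip() if match else ""
        problem = validate(value) if validate else None
        if problem:
            errors.append(f"{name}: {problem}")
            continue
        values[name] = value
    # Mirror the question's behaviour: an empty ID falls back to the file's position.
    if values.get("ID") == "":
        values["ID"] = fallback_id
    return values, errors
```

In the main loop, each file would then need one call such as extract_fields(table, ListRefCount), with the returned errors written to the report and the values appended to the result lists.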