I have a list of all country names in the world. For each name I generated fuzzy variants and appended them to the same list. An example with two countries:
"USA", "CHINA", "#USA", "U#SA", "US#A", "USA#", "#CHINA"
and so on. This is done for every country, so the fuzzy list has 7648 entries.
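A minimal sketch of how such a list could be generated, assuming a single "#" is inserted at every position of each name. The function names `fuzzy_variants` and `build_fuzzy_list` are illustrative and not taken from the original code:

```python
def fuzzy_variants(name, marker="#"):
    """Return every variant of name with one marker inserted at each position."""
    return [name[:i] + marker + name[i:] for i in range(len(name) + 1)]


def build_fuzzy_list(countries, marker="#"):
    """Return the original names followed by all of their fuzzy variants."""
    fuzzy = list(countries)
    for name in countries:
        fuzzy.extend(fuzzy_variants(name, marker))
    return fuzzy


print(fuzzy_variants("USA"))  # ['#USA', 'U#SA', 'US#A', 'USA#']
```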
I also have a list of sentences (transaction information), and I want to extract the country of origin from each one. The country name in a transaction may be broken up by a "#", which is why the fuzzy variants were introduced.
In Excel, I need to write the transaction information to the first column and the extracted country to the second column. However, it seems like I am overwriting the cells. Any suggestions for what is wrong? Each transaction should get exactly one country. If no country is found, the cell should contain "N/A".
    for i in range(len(transactions)):
        ws.write(i, 0, transactions[i])  # Paste transaction data into column 0
        for country in fuzzy_countries[i]:
            for i in range(len(transactions)):
                if country in transactions[i]:
                    ws.write(i, 1, country.replace("#", ""))  # Remove the "#" and paste the country into column 1
                else:
                    ws.write(i, 1, "N/A")
Any suggestions are appreciated.
Thanks.
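One possible way to avoid writing the same cell more than once is to decide the country for each transaction first and then write each row exactly once. The sketch below is hedged. It assumes `ws` is any worksheet object with a `write(row, col, value)` method as used above, that matching is a plain case-sensitive substring test, and that the longest matching variant is the best guess when several match. The names `extract_country` and `write_rows` are hypothetical:

```python
def extract_country(transaction, fuzzy_countries):
    """Return the cleaned country name found in transaction, or "N/A"."""
    matches = [name for name in fuzzy_countries if name in transaction]
    if not matches:
        return "N/A"
    # Assumption: if several variants match, prefer the longest one.
    return max(matches, key=len).replace("#", "")


def write_rows(ws, transactions, fuzzy_countries):
    """Write each transaction and its country once per row."""
    for row, transaction in enumerate(transactions):
        ws.write(row, 0, transaction)  # column 0: transaction text
        ws.write(row, 1, extract_country(transaction, fuzzy_countries))  # column 1: country


fuzzy_countries = ["USA", "CHINA", "#USA", "U#SA", "US#A", "USA#", "#CHINA"]
transactions = ["PAYMENT FROM U#SA", "TRANSFER CHINA 123", "UNKNOWN ORIGIN"]
countries = [extract_country(t, fuzzy_countries) for t in transactions]
print(countries)  # ['USA', 'CHINA', 'N/A']
```

Because the "N/A" fallback is applied only after every variant has been checked, a later non-matching country cannot replace an earlier match, and each cell in column 1 is written a single time.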