I have a dataframe with trade flows (exports) between countries of origin and countries of destination (which are specified in two different columns of the df). I need to clean the data and delete rows where the country of origin matches the country of destination. I used the following code:
dfn = dfn[dfn["Destination"] != dfn["Origin"]]
However, I realized I actually need to keep the rows where "World" is both the destination and the origin (i.e. total world exports to the world). How can I delete all rows where destination == origin, except for the rows where destination == origin == "World"?
I was thinking of iterating through my ~2 million rows and deleting only those that meet a certain condition. I tried something along those lines, but it doesn't really work. Could you please help me?
for index, row in dfn.iterrows():
    # Drop rows where origin equals destination, unless both are 'World'
    if row['Destination'] == row['Origin'] and row['Destination'] != 'World':
        dfn.drop(index, inplace=True)
Many thanks in advance
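For what it's worth, the condition can probably be written as a single boolean mask instead of a loop, which should be much faster on ~2 million rows. Below is a minimal sketch of that idea. The sample data and the "Exports" column are made up for illustration; only the "Origin" and "Destination" column names and the "World" label come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names "Origin" and
# "Destination" and the label "World" are taken from the question.
dfn = pd.DataFrame({
    "Origin": ["France", "France", "World", "World", "Germany"],
    "Destination": ["France", "Germany", "World", "France", "Germany"],
    "Exports": [10, 20, 300, 40, 50],
})

# True where origin and destination are the same country.
same = dfn["Destination"] == dfn["Origin"]
# True where the destination is the "World" aggregate.
is_world = dfn["Destination"] == "World"

# Keep a row if origin and destination differ, OR if the row is World -> World.
dfn = dfn[~same | is_world]
```

With this sample, the France -> France and Germany -> Germany rows are dropped, while World -> World and all rows with different origin and destination are kept. Note that `|` and `~` are used rather than `or` and `not`, because the comparison results are whole columns of booleans rather than single values.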