I'm new to pandas, and to Python really, but I have one last task in a script that I cannot get my head around.
Sadly I'm not allowed to share the data, otherwise I would.
I have three conditions that I want to filter by.
I have a dataframe of vehicles that have run through a software update, with a status for each software download.
Condition 1 - If a VIN has gone through the same campaign number more than once and had a success, I want to keep that success entry and delete the other entry.
Condition 2 - If a VIN has gone through the same campaign more than once and not had a success, I want to keep the latest entry and delete the older one.
Condition 3 - If a VIN has only gone through a campaign once, do nothing and keep that entry.
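The three conditions can probably be collapsed into one rule: within each VIN/campaign pair, rank successes ahead of everything else, break ties by the latest date, and keep the top row. Here is a minimal sketch of that idea. The sample data is made up, and the `Status` column with a `"Success"` value is an assumption, since the real status column is not named above; `VIN`, `Campaign Name Split` and `Planned Start Date` are taken from the attempts further down.

```python
import pandas as pd

# Made-up sample data. "Status" and the value "Success" are assumed names.
df = pd.DataFrame({
    "VIN": ["A", "A", "B", "B", "C"],
    "Campaign Name Split": ["C1", "C1", "C1", "C1", "C2"],
    "Planned Start Date": pd.to_datetime(
        ["2021-01-01", "2021-02-01", "2021-01-01", "2021-03-01", "2021-01-15"]
    ),
    "Status": ["Success", "Failed", "Failed", "Failed", "Failed"],
})

# Temporary helper column: True for success rows, False otherwise.
ranked = df.assign(is_success=df["Status"].eq("Success")).sort_values(
    ["VIN", "Campaign Name Split", "is_success", "Planned Start Date"],
    # Within each VIN/campaign pair: successes first, then latest date first.
    ascending=[True, True, False, False],
)

# Keep the top-ranked row per VIN/campaign pair, then restore the original order.
result = (
    ranked.drop_duplicates(subset=["VIN", "Campaign Name Split"], keep="first")
    .drop(columns="is_success")
    .sort_index()
)
print(result)
```

A pair that appears only once is never a duplicate, so condition 3 needs no special handling. If a pair has several successes, this sketch keeps the latest one, which is a choice the conditions above do not spell out.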
I have attacked this a few ways, but I'm not advanced enough to do it correctly. I need help.
Brainstorming, I came up with this:
    for i in df:
        if (df['VIN'].value_counts()) >= 2:
            if df['Campaign Name Split'] == df['Campaign Name Split']:
                if df['Planned Start Date'] >= df['Planned Start Date']:
                    print(i)
If there was a match, I don't know how to directly reference match row 1 and match row 2, which is one problem I face :D
So I went on to something like this:
unique_VINS_List = set(df['VIN'])
for VIN in unique_VINS_List:
    temp_df = df[df['VIN'].isin([VIN])]
But again, I'm getting lost...
So I then thought, can I use groupby?
df.groupby(['VIN', 'Campaign Name'])
Then some sort of nested if statements...
I have no idea where to go from here.
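For what it's worth, the groupby-plus-if idea can be written out fairly directly. This is only a sketch under the same assumptions as before: the sample rows are invented, `Status` / `"Success"` are assumed names for the status column and its value, and `pick_row` is a hypothetical helper. Looping over the groupby hands back one small dataframe per VIN/campaign pair, which is one way to get at "match row 1 and match row 2" together.

```python
import pandas as pd

# Invented sample rows. "Status" and "Success" are assumed names.
df = pd.DataFrame({
    "VIN": ["A", "A", "B", "B", "C"],
    "Campaign Name Split": ["C1", "C1", "C1", "C1", "C2"],
    "Planned Start Date": pd.to_datetime(
        ["2021-01-01", "2021-02-01", "2021-01-01", "2021-03-01", "2021-01-15"]
    ),
    "Status": ["Success", "Failed", "Failed", "Failed", "Failed"],
})


def pick_row(group: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return the one row to keep for a VIN/campaign pair."""
    # Condition 3: a single entry is kept as it is.
    if len(group) == 1:
        return group
    successes = group[group["Status"] == "Success"]
    # Condition 1: prefer success rows. Condition 2: otherwise consider all rows.
    candidates = group if successes.empty else successes
    # Of the candidates, keep the one with the latest planned start date.
    return candidates.sort_values("Planned Start Date").tail(1)


# Each iteration yields the group key and the matching rows as a dataframe.
pieces = [
    pick_row(group)
    for _, group in df.groupby(["VIN", "Campaign Name Split"])
]
kept = pd.concat(pieces).sort_index()
print(kept)
```

The explicit loop is slower than a sort followed by `drop_duplicates` on large data, but it keeps the three conditions readable as separate branches.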