Thursday, 6 May 2021

Pandas: how to select which rows in a group to remove, IF or GroupBy?

I'm new to pandas, and to Python really, but I have one last task in a script that I cannot get my head around.

Sadly, I'm not allowed to share the data, otherwise I would.

I have 3 conditions that I want to filter by.

I have a dataframe of vehicles that have run through a software update, with a status for each software download.

Condition 1 - If a VIN has gone through the same campaign number more than once and had a success, I want to keep the success entry and delete the other entry.

Condition 2 - If a VIN has gone through the same campaign more than once and not had a success, I want to keep the latest entry and delete the older one.

Condition 3 - If a VIN has only gone through a campaign once, do nothing and keep that entry.
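One possible way to express the three conditions without any loop is to sort the rows so the preferred one comes first in each VIN/campaign pair, then drop the duplicates. This is only a sketch on made-up data: the `Status` column name and its `Success` value are assumptions (the post does not name the status column), and `is_success` is a temporary helper column. If a pair has several successes, this keeps the latest one.

```python
import pandas as pd

# Hypothetical sample data; 'Status' and its values are assumed names.
df = pd.DataFrame({
    "VIN": ["A", "A", "B", "B", "C"],
    "Campaign Name Split": ["C1", "C1", "C1", "C1", "C2"],
    "Status": ["Failed", "Success", "Failed", "Failed", "Failed"],
    "Planned Start Date": pd.to_datetime(
        ["2021-01-05", "2021-01-01", "2021-01-01", "2021-02-01", "2021-03-01"]
    ),
})

# Temporary flag so successes can be ranked ahead of every other status.
df["is_success"] = df["Status"].eq("Success")

result = (
    # Successes first, then newest date first.
    df.sort_values(["is_success", "Planned Start Date"], ascending=[False, False])
      # Keep only the top-ranked row of each VIN/campaign pair.
      .drop_duplicates(subset=["VIN", "Campaign Name Split"], keep="first")
      .drop(columns="is_success")
      # Restore the original row order.
      .sort_index()
)
print(result)
```

A pair that appears only once has nothing to drop, so condition 3 needs no special handling.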

I have attacked this a few ways, but I'm not advanced enough to do it correctly. I need help.

Brainstorming came up with this:

for i in df:
    if (df['VIN'].value_counts()) >= 2:
        if df['Campaign Name Split'] == df['Campaign Name Split']:
            if df['Planned Start Date'] >= df['Planned Start Date']:
                print(i)

If there was a match, I don't know how to directly reference match row 1 and match row 2, which is one problem I face :D

So I went on to something like:
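On referencing the matching rows: one approach, sketched here on made-up data, is to flag every row whose VIN/campaign pair occurs more than once with `duplicated(keep=False)` and then walk through the pairs with `groupby`. Inside the loop, `rows` is a small dataframe holding just the matching rows, so `iloc[0]` and `iloc[-1]` give "match row 1" and "match row 2". The `pairs` list is only there for illustration.

```python
import pandas as pd

# Hypothetical sample data using the column names from the post.
df = pd.DataFrame({
    "VIN": ["A", "A", "B"],
    "Campaign Name Split": ["C1", "C1", "C1"],
    "Planned Start Date": pd.to_datetime(
        ["2021-01-01", "2021-02-01", "2021-01-15"]
    ),
})

key = ["VIN", "Campaign Name Split"]

# keep=False marks every row of a repeated pair, not only the later ones.
repeated = df[df.duplicated(subset=key, keep=False)]

pairs = []
for (vin, campaign), rows in repeated.groupby(key):
    rows = rows.sort_values("Planned Start Date")
    older = rows.iloc[0]   # earliest entry for this VIN/campaign
    newer = rows.iloc[-1]  # latest entry for this VIN/campaign
    pairs.append((older.name, newer.name))  # .name is the original row label
    print(vin, campaign, older["Planned Start Date"], newer["Planned Start Date"])
```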

unique_VINS_List = set(df['VIN'])

for VIN in unique_VINS_List:
    temp_df = df[df['VIN'].isin([VIN])]

But again, I'm getting lost...

So I then thought, can I use groupby?

df.groupby(['VIN', 'Campaign Name'])

Then some sort of nested IF statements...

I have no idea where to go.
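For what it's worth, the groupby-plus-IF idea can be written quite directly: group by VIN and campaign, and let a small function decide which single row of each group to keep. This is a sketch, not a definitive solution. `pick_row` is a hypothetical helper, and the `Status` column with a `Success` value is an assumption, since the real status column is not named here.

```python
import pandas as pd

# Hypothetical sample data; 'Status' and its values are assumed names.
df = pd.DataFrame({
    "VIN": ["A", "A", "B", "B", "C"],
    "Campaign Name Split": ["C1", "C1", "C1", "C1", "C2"],
    "Status": ["Failed", "Success", "Failed", "Failed", "Failed"],
    "Planned Start Date": pd.to_datetime(
        ["2021-01-05", "2021-01-01", "2021-01-01", "2021-02-01", "2021-03-01"]
    ),
})


def pick_row(group):
    """Return the index label of the one row to keep from a VIN/campaign group."""
    successes = group[group["Status"] == "Success"]
    # Prefer successful rows if there are any, otherwise consider all rows.
    candidates = group if successes.empty else successes
    # Among the candidates, keep the most recent entry.
    return candidates["Planned Start Date"].idxmax()


# Iterating over a groupby yields (key, sub-dataframe) pairs.
keep_idx = [pick_row(group)
            for _, group in df.groupby(["VIN", "Campaign Name Split"])]
result = df.loc[keep_idx]
print(result)
```

A group with a single row simply returns that row's label, so a VIN that went through a campaign only once is kept as-is.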
