Sunday, November 4, 2018

loop over dataframe column python

I have three dataframes:

df1, df2, df3.

Each of these dataframes has a column (column1, column2, and column3, respectively) that holds an ID.

I have a master dataframe, called master_df, with a column named column_master. This column also holds an ID.

I would like to write a loop that creates a new column called 'flag' in master_df and fills it whenever column_master contains an ID from df1, df2, or df3: flag1 if the ID was found in df1, flag2 if found in df2, flag3 if found in df3.

I attempted this so far, but I am at a loss as to how to finish the code:

def create_flag(df):
    if df['column_master'] in df1['column1']:
        return df['flag']==flag_1
    elif df['column_master'] in ('column2'):
        return df['flag']==flag_2
    elif df['column_master'] in ('column3'):
        return df['flag']==flag_3

    return df

create_flag(master_df)

This throws an error saying it does not recognize my column names. What am I doing wrong, and is there a better way to write this?
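
For reference, a few things in the attempt stand out. `in` applied to a pandas Series checks the index labels rather than the values. `('column2')` is just the string 'column2', not a column of df2. `==` compares instead of assigning. And `flag_1` is a bare name rather than a string. One possible vectorized alternative, with no explicit loop, is sketched below using `Series.isin` and `numpy.select`. The sample IDs, the "none" label for unmatched rows, and the choice to return a copy are assumptions made for illustration, not part of the original question.

```python
import numpy as np
import pandas as pd

# Toy data: these ID values are made up purely for illustration.
df1 = pd.DataFrame({"column1": [1, 2]})
df2 = pd.DataFrame({"column2": [3, 4]})
df3 = pd.DataFrame({"column3": [5]})
master_df = pd.DataFrame({"column_master": [1, 3, 5, 9]})


def create_flag(df):
    """Return a copy of df with a 'flag' column based on ID membership."""
    out = df.copy()
    ids = out["column_master"]

    # One boolean mask per source dataframe, aligned row by row with ids.
    conditions = [
        ids.isin(df1["column1"]),
        ids.isin(df2["column2"]),
        ids.isin(df3["column3"]),
    ]
    choices = ["flag1", "flag2", "flag3"]

    # np.select uses the first condition that is True for each row;
    # rows matching none of them get the default ("none" is an assumed label).
    out["flag"] = np.select(conditions, choices, default="none")
    return out


flagged = create_flag(master_df)
print(flagged)
```

If an ID appears in more than one of df1, df2, and df3, this sketch keeps only the first match in the order the conditions are listed; a different rule would be needed if every match should be recorded.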
