Wednesday, 14 October 2020

How to loop through a list of data frames and assign columns based on an if condition

I have a list of dataframes that I want to loop through, checking whether a particular string appears in the id column. If it does, I want to get the index of that row and add a new column that marks every row below that index as yes. If the string does not appear in the id column, I want to skip that dataframe and apply another condition instead. My sample code is below:

import numpy as np
import pandas as pd

sample_df_2 = pd.DataFrame(data={
  'id': ['A', 'B', 'C','G','D','E'],
  'n' : [  1,   2,   3, 5,  5,  9],
  'v' : [ 10,  13,   8, 8,  4 ,  3],
  'z' : [5,    3,    6, 9,  9,   8]
})

sample_df_1 = pd.DataFrame(data={
  'id': ['L', 'K', 'C','G','D','E'],
  'n' : [  1,   2,   3, 5,  5,  9],
  'v' : [ 10,  13,   8, 8,  4 ,  3],
  'z' : [5,    3,    6, 9,  9,   8]
})

def assign_new_column(data_frame):
    if data_frame['id'].str.contains('A','C').any():
        index_A=data_frame[data_frame['id']=='A'].index.tolist()
        index_C=data_frame[data_frame['id']=='C'].index.tolist()

        data_frame['Yes']=np.select([data_frame.index<=index_A],['Yeaaap'])
        data_frame['No']=np.select([data_frame.index<=index_C],['Yeaaap'])

    else:
        index_G=data_frame[data_frame['id']=='G'].index.tolist()
        index_D=data_frame[data_frame['id']=='D'].index.tolist()

        data_frame['Yes']=np.select([data_frame.index<=index_G],['Noop'])
        data_frame['No']=np.select([data_frame.index<=index_D],['supp'])

    return data_frame

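# leftover lookup; not used below
index=sample_df_2[sample_df_2['id']=='A'].index.tolist()

# assumed order, matching the output below
df_list=[sample_df_2, sample_df_1]

df=[]
for i in range(len(df_list)):
    df.append(assign_new_column(df_list[i]))
pd.concat(df)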

This gives me the following output:

  id  n   v  z     Yes      No
0  A  1  10  5  Yeaaap  Yeaaap
1  B  2  13  3       0  Yeaaap
2  C  3   8  6       0  Yeaaap
3  G  5   8  9       0       0
4  D  5   4  9       0       0
5  E  9   3  8       0       0
0  L  1  10  5    Noop    supp
1  K  2  13  3    Noop    supp
2  C  3   8  6    Noop    supp
3  G  5   8  9    Noop    supp
4  D  5   4  9       0    supp
5  E  9   3  8       0       0

But this is not correct as it is sort of overwriting the strings.

Can anyone help me solve this in an efficient manner?
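
For reference, here is one possible reading of the code above, offered as a sketch and not a definitive answer. In `str.contains('A','C')` the second positional argument is `case`, so only 'A' is actually searched for, and the 0 entries in the output come from `np.select` falling back to its default value of 0. The version below assumes the intent was "id contains 'A' or 'C'" and that rows up to and including the first match should be labelled (as the `<=` in the code does), with an empty string everywhere else. The helper name `label_up_to` and the empty-string default are hypothetical choices, not part of the code above.

```python
import numpy as np
import pandas as pd

sample_df_2 = pd.DataFrame(data={
    'id': ['A', 'B', 'C', 'G', 'D', 'E'],
    'n': [1, 2, 3, 5, 5, 9],
    'v': [10, 13, 8, 8, 4, 3],
    'z': [5, 3, 6, 9, 9, 8],
})

sample_df_1 = pd.DataFrame(data={
    'id': ['L', 'K', 'C', 'G', 'D', 'E'],
    'n': [1, 2, 3, 5, 5, 9],
    'v': [10, 13, 8, 8, 4, 3],
    'z': [5, 3, 6, 9, 9, 8],
})


def label_up_to(data_frame, target, label, default=''):
    """Label rows up to and including the first row whose id equals target.

    Hypothetical helper: works on row positions, so it does not depend on
    the frame having a 0..n-1 index.
    """
    matches = np.flatnonzero((data_frame['id'] == target).to_numpy())
    if matches.size == 0:
        # target is absent: every row gets the default
        return np.full(len(data_frame), default, dtype=object)
    positions = np.arange(len(data_frame))
    return np.where(positions <= matches[0], label, default)


def assign_new_column(data_frame):
    out = data_frame.copy()  # leave the caller's frame untouched
    # Assumption: the check was meant to be "id is 'A' or 'C'"
    if out['id'].isin(['A', 'C']).any():
        out['Yes'] = label_up_to(out, 'A', 'Yeaaap')
        out['No'] = label_up_to(out, 'C', 'Yeaaap')
    else:
        out['Yes'] = label_up_to(out, 'G', 'Noop')
        out['No'] = label_up_to(out, 'D', 'supp')
    return out


df_list = [sample_df_2, sample_df_1]
result = pd.concat([assign_new_column(d) for d in df_list], ignore_index=True)
print(result)
```

Under this reading, sample_df_1 also takes the first branch because it contains 'C'. Changing `<=` to `>` in `label_up_to` would label the rows below the match instead, which is closer to the wording of the question; which of the two is wanted is not clear from the post.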
