Monday, 22 March 2021

A newly created column in a DataFrame needs to be filled with values from another column, based on a condition

The DataFrame has four columns. Column 'Id' is unique, and the rows are grouped by column 'idhogar'. Column 'parentesco1' has a status of 0 or 1. Column 'Target' has values that can differ between rows sharing the same 'idhogar' value.

INDEX  Id     parentesco1   idhogar Target
0   ID_fe8c32eba    0   4616164 2
1   ID_ca701e058    1   4616164 2
2   ID_5ad4372cd    0   4983866 3
3   ID_1e320689c    1   4983866 3
4   ID_700e30a8d    0   5905417 2
5   ID_bc99ecfb8    0   5905417 2
6   ID_308a05a16    1   5905417 2
7   ID_00186dde5    1   7.56E+06    4
8   ID_34570a74c    1   20713493    4
9   ID_b13870a19    1   27651991    3
10  ID_74e989389    1   45038655    4
11  ID_726ba7d34    0   60027579    4
12  ID_b75d7c648    0   60027579    4
13  ID_37e7b3aaa    1   60027579    4
14  ID_396da5a70    0   104578907   2
15  ID_4381374bb    1   104578907   2
16  ID_272a9b4d5    0   119024319   4
17  ID_1225f3779    0   119024319   4
18  ID_fc5dfaa2e    0   119024319   4
19  ID_7390a3f99    1   119024319   4

A new column 'Rev_target' has been created. For every row in the same 'idhogar' group, it needs to hold the 'Target' value of the row in that group where 'parentesco1' is 1.

I tried the following, but it was not successful.

for idhogar in df['idhogar'].unique():
    if len(df[df['idhogar'] == idhogar]['Target'].unique())!= 1:
        rev_target_val=df[(df['idhogar']== idhogar) & (df['parentesco1']==1)]['Target']
        df['Rev_target']=rev_target_val
        
# NOT WORKING AS REQUIRED: gives NaN in all rows of the newly created column

I then tried the following, but it throws an error.

for idhogar in df['idhogar'].unique():
    rev_target_val=df[(df['idhogar']== idhogar) & (df['parentesco1']==1)]['Target']
    df['Rev_target']=np.where(len(df[df['idhogar'] == idhogar]['Target'].unique())!= 1,
                              rev_target_val,df['Target'])

ValueError: operands could not be broadcast together with shapes () (0,) (9557,)
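
The shapes in this message seem to point at the cause: the condition is a scalar with shape (), 'rev_target_val' is empty with shape (0,), which suggests that at least one 'idhogar' has no row with 'parentesco1' equal to 1, and df['Target'] has 9557 rows. A minimal sketch that reproduces the same kind of error with made-up arrays (the names and sizes here are illustrative, not taken from the real data):

```python
import numpy as np

cond = True              # scalar condition, shape ()
empty = np.array([])     # stands in for an empty rev_target_val, shape (0,)
full = np.arange(5)      # stands in for df['Target'], shape (5,)

try:
    np.where(cond, empty, full)
    message = None
except ValueError as exc:
    # An array of length 0 cannot be broadcast against one of length 5.
    message = str(exc)

print(message)
```
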

I also tried the following, but it does not work as intended: it gives the same value, 2, in every row of the new 'Rev_target' column.

for idhogar in df['idhogar'].unique():
    rev_target_val=df[(df['idhogar']== idhogar) & (df['parentesco1']==1)]['Target']
    df['Rev_target']=df.apply(lambda x: rev_target_val if (len(df[df['idhogar'] == idhogar]['Target'].unique())!= 1)
                              else df['Target'],axis=1)

I would appreciate a solution. Thanks in advance.
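
One possible approach, sketched without a loop: build a lookup of the 'Target' value of the 'parentesco1' == 1 row per 'idhogar', then map it onto every row. The sample frame below uses made-up 'idhogar' labels and 'Target' values so that the effect is visible. It is not the real data. Two assumptions are baked in: if a group has more than one row with 'parentesco1' equal to 1, the first one wins, and a group with no such row keeps its own 'Target'.

```python
import pandas as pd

# Hypothetical sample: Target differs within a household, and h3 has no head row.
df = pd.DataFrame({
    "idhogar":     ["h1", "h1", "h2", "h2", "h2", "h3"],
    "parentesco1": [0,    1,    0,    0,    1,    0],
    "Target":      [3,    2,    4,    1,    4,    3],
})

# One Target per idhogar, taken from the row where parentesco1 == 1.
# drop_duplicates keeps the first such row (assumption) so the index is unique.
head_target = (
    df.loc[df["parentesco1"] == 1]
      .drop_duplicates("idhogar")
      .set_index("idhogar")["Target"]
)

# Look up each row's household; households without a head row give NaN,
# which is replaced by the row's own Target (assumption).
df["Rev_target"] = (
    df["idhogar"].map(head_target)
      .fillna(df["Target"])
      .astype(df["Target"].dtype)
)

print(df)
```

Because the lookup is computed once and applied with 'map', the column is assigned in a single step instead of being overwritten on every pass of a loop over df['idhogar'].unique().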
