Thursday, April 5, 2018

Python Pandas: replace values in one column based on conditions in multiple other columns

Working with the dataframe df:

Product_ID | Category_A | Category_B
1232       | 0          | 0
1343       | Unknown    | X
2543       | Nan        | 0
2549       | Y          | Y
0349       | X          | X
8533       | Y          | X

I would like to create a new column Category_Final, with the following rules:

  • If Category_A is 0, Unknown or Nan, Category_Final should be "Unknown"
  • If Category_A is the same as Category_B, Category_Final should be 0
  • If Category_A is different from Category_B, Category_Final should be X

Expected Output:

Product_ID | Category_A | Category_B | Category_Final
1232       | 0          | 0          | Unknown
1343       | Unknown    | X          | Unknown
2543       | Nan        | 0          | Unknown
2549       | Y          | Y          | 0
0349       | X          | X          | 0
8533       | Y          | X          | X

I managed to get the logic for 0 and X working, but I don't know how to include the Unknown logic:

import numpy as np
df['Category_Final'] = np.where(df['Category_A'] != df['Category_B'], 'X', '0')

Thank you!
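
One possible way to fold the Unknown rule into the same assignment is np.select, which takes a list of conditions in priority order plus a default. The sketch below is a rough illustration, not a definitive answer. It assumes the second column is actually named Category_B (no stray space), that every value is stored as a string (so 0 is "0" and Nan is the literal text "Nan"), and it rebuilds the sample frame from the table in the question. The names is_unknown and is_same are made up for the example.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the table in the question. All values are
# assumed to be strings; "Nan" is kept as literal text, as shown there.
df = pd.DataFrame({
    "Product_ID": ["1232", "1343", "2543", "2549", "0349", "8533"],
    "Category_A": ["0", "Unknown", "Nan", "Y", "X", "Y"],
    "Category_B": ["0", "X", "0", "Y", "X", "X"],
})

# Rule 1: Category_A is 0, Unknown, Nan, or a genuinely missing value.
is_unknown = (
    df["Category_A"].isin(["0", "Unknown", "Nan"]) | df["Category_A"].isna()
)

# Rule 2: both categories match.
is_same = df["Category_A"] == df["Category_B"]

# np.select uses the first condition that is True for each row, so the
# Unknown rule takes priority; rows matching neither condition get "X".
df["Category_Final"] = np.select(
    [is_unknown, is_same], ["Unknown", "0"], default="X"
)

print(df)
```

If the real column holds the integer 0 rather than the string "0", the list passed to isin would need to include 0 as well. A nested np.where would also work here, but np.select tends to stay readable as more rules are added.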
