I'm trying to write a simple statement that creates a new column in the dataframe and, for each row, fills it with the value from whichever column is not 'nan'. My data looks like this:
ID1 ID2
Apple nan
Orange nan
nan Pear
nan Grape
Ideally it would then look like so:
ID1 ID2 MasterID
Apple nan Apple
Orange nan Orange
nan Pear Pear
nan Grape Grape
I've tried using the following:
df['MasterID'] = ''
df.loc[df['ID1'] != 'nan','MasterID'] = df['ID1']
df.loc[df['ID2'] != 'nan','MasterID'] = df['ID2']
But the last statement takes priority and overwrites what the previous line set. The same thing happens when I use NumPy's where, like this:
df['MasterID'] = np.where(df['ID1'] != 'nan',
                          df['ID1'],
                          df['ID2'])
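One possible explanation, assuming the blanks are real missing values (float NaN) rather than the literal string 'nan': NaN never equals the string 'nan', so a `!= 'nan'` comparison is True for every row and each condition selects the whole column. A minimal sketch on a hypothetical reconstruction of the sample data:

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the sample data, assuming real NaN values.
df = pd.DataFrame({
    "ID1": ["Apple", "Orange", np.nan, np.nan],
    "ID2": [np.nan, np.nan, "Pear", "Grape"],
})

# NaN is not equal to the string 'nan', so this mask is True in every row.
mask_string_compare = df["ID1"] != "nan"

# notna() is the check that separates present values from missing ones.
mask_notna = df["ID1"].notna()

print(mask_string_compare.tolist())
print(mask_notna.tolist())
```

If the two printed masks differ, the string comparison is not detecting the missing values.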
I'd also like an approach that would extend to 3+ columns in the future. I'd appreciate any guidance.
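A sketch of one way this might be done for any number of columns, not a definitive answer: back-fill across the columns and keep the first one, so each row ends up with its first non-missing ID. The `id_cols` list and the reconstructed frame are assumptions for illustration; the `mask` step only matters if the data holds the literal string 'nan'.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the sample data, assuming real NaN values.
df = pd.DataFrame({
    "ID1": ["Apple", "Orange", np.nan, np.nan],
    "ID2": [np.nan, np.nan, "Pear", "Grape"],
})

# Columns to coalesce, in priority order; add "ID3", ... as needed.
id_cols = ["ID1", "ID2"]

# Turn any literal 'nan' strings into real missing values.
ids = df[id_cols].mask(df[id_cols] == "nan")

# Back-fill along each row, then take the first column:
# every row gets its first non-missing value from id_cols.
df["MasterID"] = ids.bfill(axis=1).iloc[:, 0]

print(df)
```

If several columns are filled in the same row, the leftmost column in `id_cols` wins, so the order of that list sets the priority.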