Sunday, June 7, 2020

Create a new column with if/else logic based on conditions across multiple columns

I've looked into similar questions but, as far as I searched, I couldn't find anything that helped.

I have a daily report that I extract from a database, but one field in it is not exactly what needs to be delivered. Here's an example of what I extract:

col1           col2
wrongstring    correct
correctstring  correct
correctstring  correct
NaN            correct
NaN            NaN

The values in col2 have already been corrected using a dict and replace. The NaN entries are values missing from the database, and I need to replace them with the correct string for missing values. Today this is done in Excel with a VLOOKUP and an IF, and I want to implement it inside the script to save some time.

What I want to do is:

If df['col1'] == 'wrongstring', then the new column should use the df['col2'] value.

If df['col1'] is NaN, then the new column should use the df['col2'] value.

If both columns are NaN then the new column should use newstring.

Else keep df['col1'] value.

So far I've come up with this code, which raises an error (I understand it comes from the .isnull() part, but I couldn't find a way to fix it):

df['newcolumn'] = [x in df['col2'] if x=='wrongstring' else ('newstring' if ((df['col1'].isnull()) and (df['col2'].isnull())) else x in df['col1']) 
                           for x in df['col1']] 
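
For context, the error most likely comes from combining two whole boolean Series with Python's `and`, which needs a single True/False value. The minimal sketch below reproduces that on the sample data above; the names `both_missing` and `error_message` are only illustrative. Note also that `x in df['col2']` tests membership in the Series index and returns a boolean, not a value from the column.

```python
import numpy as np
import pandas as pd

# Sample data copied from the example table
df = pd.DataFrame({
    "col1": ["wrongstring", "correctstring", "correctstring", np.nan, np.nan],
    "col2": ["correct", "correct", "correct", "correct", np.nan],
})

# isnull() returns one boolean per row; `&` combines them element-wise
both_missing = df["col1"].isnull() & df["col2"].isnull()

try:
    # `and` asks each Series for a single truth value, which pandas refuses
    df["col1"].isnull() and df["col2"].isnull()
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```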

Could anyone help me out with this? Maybe my approach is not the correct one, or I'm missing something. The result should look like this:

col1           col2     newcolumn
wrongstring    correct  correct
correctstring  correct  correctstring  
correctstring  correct  correctstring  
NaN            correct  correct
NaN            NaN      newstring
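
One possible way to get this result, as a minimal sketch rather than a definitive solution, is to blank out 'wrongstring', then fill the gaps from col2, and finally fill whatever is still missing with 'newstring'. It assumes the sample data above. It also assumes that a row with 'wrongstring' in col1 and NaN in col2 should get 'newstring' as well, a case the rules above do not cover explicitly.

```python
import numpy as np
import pandas as pd

# Sample data copied from the example table
df = pd.DataFrame({
    "col1": ["wrongstring", "correctstring", "correctstring", np.nan, np.nan],
    "col2": ["correct", "correct", "correct", "correct", np.nan],
})

# Step 1: treat 'wrongstring' like a missing value (mask replaces matches with NaN)
cleaned = df["col1"].mask(df["col1"] == "wrongstring")

# Step 2: take col2 wherever col1 is missing; fillna aligns on the row index
# Step 3: anything still missing means both columns were NaN
df["newcolumn"] = cleaned.fillna(df["col2"]).fillna("newstring")

print(df)
```

This works column by column instead of looping over rows, so no per-row if/else is needed.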

Thanks for any help. Cheers.
