mardi 24 août 2021

fill column in df based on comparison of other column with dictionary

I have a dictionary ‘MS_dates_dict’ with project management milestone dates MS2, MS3, MS4

MS_dates_dict =  {'MS02': [Timestamp('2019-06-30 00:00:00')],
 'MS03': [Timestamp('2021-01-31 00:00:00')],
 'MS04': [Timestamp('2021-06-30 00:00:00')]}

I have a dataframe df with hours booked on the project and a timecode in column ‘MM-YYYY’

d= {‘quantity’:[4, 17.25, 5, 2, 3, 4, 5], ‘MM-YYYY’:[ 2021-04-01, 2021-04-01, 2021-04-01, 2021-06-01, 2021-06-01, 2021-01-01, 2021-06-01]}
MS_hours_df = pd.DataFrame(data=d)

MS_hours_df = MS_hours_df.astype({'Quantity' : 'float64'})

MS_hours_df ['MM-YYYY'] =pd.to_datetime(MS_hours_df ['MM-YYYY'], format='%m.%Y')

I want to create a column MS_hours_df [‘milestone’] which contains the corresponding milestone for each row of df based on the value in column ‘MM-YYYY’

  • If ‘MM-YYYY’ is smaller than MS2 values it should become “MS1”
  • If 'MM-YYYY’ Is between MS2 and MS3 it should become “MS2”
  • If ‘MM-YYYY’ is between MS3 and MS4 it should become “MS3”
  • If ‘MM-YYYY’ is above MS4 it should become “MSX”

I used the following code and got a ValueError: truth value is ambigous

if MS_hours_df['MM-YYYY'] < MS_dates_dict['MS02']:
    MS_hours_df['milestone'] == 'MS01'
elif MS_hours_df['MM-YYYY'] >= MS_dates_dict['MS02'] & MS_hours_df['MM-YYYY'] <MS_dates_dict['MS03']:
    MS_hours_df['milestone'] == 'MS03'
elif MS_hours_df['MM-YYYY'] >= MS_dates_dict['MS03']:
    MS_hours_df['milestone'] == 'MSX'

Where is my mistake?

Aucun commentaire:

Enregistrer un commentaire