Hi, I have the following df with a duration column, and I would like to create another column, 'duration_category', based on duration bins:
duration
24
74
2
17
26
.....
120
37
43
I have written the following function:
def return_duration_category(duration):
    cat = ''
    if (duration > 0 and duration <=12):
        cat = '0-12'
    elif (duration > 12 and duration <= 24):
        '13-24'
    elif (duration > 24 and duration <= 36):
        cat ='25-36'
    elif (duration > 36 and duration <= 48):
        '37-48'
    elif (duration > 48 and duration <= 60):
        cat ='49-60'
    elif (duration > 60):
        cat ='60+'
    return (cat)
and applied it to the df in this fashion:
df['duration_category'] = rentals_add_friction['duration'].apply(lambda x: return_duration_category(x))
The output is correct for some values but empty for others, and the pattern looks random. I don't understand how any cell can be empty, since the conditions in the if statement should cover all values. Can anyone help?
Output is like this:
duration  duration_category
24
74        60+
2         0-12
17
26        25-36
.....
120       60+
37
43
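A possible explanation, going only by the function as posted: the empty cells are not random. They fall exactly in the 13-24 and 37-48 ranges (24, 17, 37, 43 above), and those are the two branches where the label is written as a bare string ('13-24', '37-48') without being assigned to cat, so cat keeps its initial value ''. A minimal sketch of the function with those two assignments added follows; the sample list is just the values shown in the question, not the real data.

```python
def return_duration_category(duration):
    """Return the bin label for a duration; '' if no bin matches (e.g. duration <= 0)."""
    cat = ''
    if 0 < duration <= 12:
        cat = '0-12'
    elif 12 < duration <= 24:
        cat = '13-24'  # the original branch had the string but no assignment
    elif 24 < duration <= 36:
        cat = '25-36'
    elif 36 < duration <= 48:
        cat = '37-48'  # same problem in the original branch
    elif 48 < duration <= 60:
        cat = '49-60'
    elif duration > 60:
        cat = '60+'
    return cat


# Values copied from the question, used here only as an illustration.
sample = [24, 74, 2, 17, 26, 120, 37, 43]
categories = [return_duration_category(d) for d in sample]
print(categories)
```

With the assignments in place, the same .apply call should fill every row whose duration is greater than 0. pandas also provides pd.cut for this kind of binning, which may be worth a look as an alternative to a hand-written if/elif chain.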