I would like to add a column to my dataframe that names the period of the day based on the hour.
For this, I am attempting the following:
day_period = []
for index, row in df.iterrows():
    hour_series = row["hour"]
    # Morning = 04:00-10:00
    #if hour_series >= 4 and hour_series < 10:
    if 4 >= hour_series < 10:
        day_period_str = "Morning"
        day_period.append(day_period_str)
    # Day = 10:00-16:00
    #if hour_series >= 10 and hour_series < 16:
    if 10 >= hour_series < 16:
        day_period_str = "Day"
        day_period.append(day_period_str)
    # Evening = 16:00-22:00
    #if hour_series >= 16 and hour_series < 22:
    if 16 >= hour_series < 22:
        day_period_str = "Evening"
        day_period.append(day_period_str)
    # Night = 22:00-04:00
    #if hour_series >= 22 and hour_series < 4:
    if 22 >= hour_series < 4:
        day_period_str = "Night"
        day_period.append(day_period_str)
However, when I check whether the length of my day_period list matches that of my dataframe (df), the two differ, and they shouldn't. I can't spot the mistake. How can I fix the code?
len(day_period)
>21882
len(df)
>25696
Here's a preview of the data:
                 timestamp   latitude  longitude  hour    weekday
0  2021-06-09 08:12:18.000  57.728867  11.949463     8  Wednesday
1  2021-06-09 08:12:18.000  57.728954  11.949368     8  Wednesday
2  2021-06-09 08:12:18.587  57.728867  11.949463     8  Wednesday
3  2021-06-09 08:12:18.716  57.728954  11.949368     8  Wednesday
4  2021-06-09 08:12:33.000  57.728905  11.949309     8  Wednesday
My end goal is to then add this list to the dataframe as a new column.
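For reference, here is a minimal sketch of one possible way to get there. It assumes the length mismatch comes from the chained comparisons: Python evaluates `4 >= hour_series < 10` as `4 >= hour_series and hour_series < 10`, so a given row can match several branches or none at all. The Night interval also wraps around midnight, so it needs `or` rather than `and`. The sketch replaces the loop with `numpy.select`. The sample hours are made up for illustration, and the column name `day_period` is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the "hour" column matters here.
df = pd.DataFrame({"hour": [0, 3, 4, 9, 10, 15, 16, 21, 22, 23]})

hour = df["hour"]

# Each condition is a half-open interval [start, end).
# Night (22:00-04:00) wraps around midnight, so it is handled as the default.
conditions = [
    (hour >= 4) & (hour < 10),    # Morning = 04:00-10:00
    (hour >= 10) & (hour < 16),   # Day     = 10:00-16:00
    (hour >= 16) & (hour < 22),   # Evening = 16:00-22:00
]
labels = ["Morning", "Day", "Evening"]

# np.select produces exactly one label per row, so the result always
# has the same length as the dataframe.
df["day_period"] = np.select(conditions, labels, default="Night")

print(df)
```

Because the labels are assigned as a column directly, there is no separate list to keep in sync with `len(df)`.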