Wednesday, 10 February 2021

How to select a datetime index range and use if conditions in Pandas

I have a DataFrame df with 100,000 rows and a DatetimeIndex. Take January as an example. I would like to create a new column named 'Experiment' that identifies when each experiment starts and ends. There are 10 experiments in total.

 df=
                            Place      
        Time               
        2021-01-01 00:00    home         
        2021-01-01 00:01    home       
        2021-01-01 00:02    home        
        2021-01-01 00:03    home     
        ................    ....  
        ................    ....
        2021-01-31 23:57    home
        2021-01-31 23:58    home
        2021-01-31 23:59    home

For example, experiment A runs from 2021-01-01 00:00 to 2021-01-01 00:02, and experiment J runs from 2021-01-31 23:57 to 2021-01-31 23:59. The expected result looks like this.

df=
                            Place  Experiment
        Time               
        2021-01-01 00:00    home      A   
        2021-01-01 00:01    home      A 
        2021-01-01 00:02    home      A  
        2021-01-01 00:03    home     
        ................    ....  
        ................    ....
        2021-01-31 23:57    home      J
        2021-01-31 23:58    home      J
        2021-01-31 23:59    home      J

My approach is like this.

df["experiment"] = ""
df["experiment"] = np.where(df.between_time('2021-01-01 00:00','2021-01-01 00:02'),'A',np.nan)
df["experiment"] = np.where(df.between_time('2021-01-31 23:57','2021-01-31 23:59'),'J',np.nan)

I have just realised that between_time does not work when the arguments include a date. I am also getting the error "Length of values does not match length of index".

Thank you!
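
For context, between_time accepts times of day only (no dates) and returns a filtered DataFrame rather than a boolean mask, which would explain both the failure with dates and the length mismatch when its result is passed to np.where. One possible alternative, sketched here under assumptions, is label-based slicing with .loc on the DatetimeIndex. The minute-frequency frame and the windows dictionary below are hypothetical stand-ins: only the windows for experiments A and J are given in the question, and the other eight would need to be filled in.

```python
import pandas as pd

# Hypothetical stand-in for the real data: one row per minute in January 2021.
idx = pd.date_range("2021-01-01 00:00", "2021-01-31 23:59", freq="min", name="Time")
df = pd.DataFrame({"Place": "home"}, index=idx)

# Hypothetical mapping of experiment label -> (start, end).
# Only A and J are stated in the question; add the remaining windows here.
windows = {
    "A": ("2021-01-01 00:00", "2021-01-01 00:02"),
    "J": ("2021-01-31 23:57", "2021-01-31 23:59"),
}

# Start with an empty label everywhere, then fill each window in turn.
df["Experiment"] = ""
for label, (start, end) in windows.items():
    # Slicing a sorted DatetimeIndex by label includes both endpoints.
    df.loc[start:end, "Experiment"] = label

print(df.head(4))
print(df.tail(3))
```

Because each assignment touches only the rows inside its own window, a later experiment does not overwrite an earlier one, which is what happens when the whole column is reassigned with np.where for every experiment. This assumes the index is sorted and the windows do not overlap.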
