vendredi 29 octobre 2021

Add values into a data frame under conditions of other data frame

This may be a long shot, but I am trying to add values from df1 into df2 under certain conditions. Let me give examples of my data frames:

df1
Date   Time  TimeSlot          Behavior  BehaviorTime
10.30  1030  Morning Visitors  Startle   142
10.30  1030  Morning Visitors  Retreat   155
10.30  1030  Morning Visitors  Chase     187
10.31  830   Keeper Feeding    Startle   133
10.31  830   Keeper Feeding    Chase     139

df2
SessionStart       ScanTime  Val1  Val2    Temp  Weather
10/30/21 10:33:42  10:34:42  A     60-70   68    Partly Cloudy
10/30/21 10:33:42  10:35:42  B     70-80   68    Partly Cloudy
10/30/21 10:33:42  10:36:42  A     70-80   68    Partly Cloudy
10/30/21 10:33:42  10:37:42  B     70-80   68    Partly Cloudy
10/30/21 10:33:42  10:38:42  C     70-80   68    Partly Cloudy
10/31/21 08:35:23  08:36:23  A     40-50   77    Sunny
10/31/21 08:35:23  08:37:23  C     90-100  77    Sunny
10/31/21 08:35:23  08:38:23  C     90-100  77    Sunny
10/31/21 08:35:23  08:39:23  C     90-100  77    Sunny
10/31/21 08:35:23  08:40:23  C     90-100  77    Sunny

To explain a bit further, df1 is a data frame that is recording behaviors of an animal. The session started at approximately 10:30 AM (1030) on 10/30/21 (10.30). Each time a behavior occurred, the observer would record it. The "BehaviorTime" column is when the observer recorded the behavior (in seconds). So, the first observation of df1 was recorded 142 seconds into the observation session.

The second data frame, df2, is recording the environment of the observation session. Each session was about 30 minutes, and certain values were recorded at each minute (30 observations per session). Some values change, some values stay the same.

I would like to find a way to integrate the "Behavior" column from df1 into df2 during the time that it occurs. The time in df1 is approximate, while the time in df2 is accurate. Like I said, the "BehaviorTime" column represents the number of seconds into the session. So, 142 seconds (first observation in df1) after 10:33:42 (observation start time from df2) would be 10:36:24, which would mean the “Startle” value would be included in the df2 row with 10:36:42 because that time is closest to the calculated time. Essentially, it would look like this:

df3
SessionStart       ScanTime  Val1  Val2    Temp  Weather        Behavior
10/30/21 10:33:42  10:34:42  A     60-70   68    Partly Cloudy
10/30/21 10:33:42  10:35:42  B     70-80   68    Partly Cloudy
10/30/21 10:33:42  10:36:42  A     70-80   68    Partly Cloudy  Startle
10/30/21 10:33:42  10:36:42  A     70-80   68    Partly Cloudy  Retreat
10/30/21 10:33:42  10:37:42  B     70-80   68    Partly Cloudy  Chase
10/30/21 10:33:42  10:38:42  C     70-80   68    Partly Cloudy
10/31/21 08:35:23  08:36:23  A     40-50   77    Sunny
10/31/21 08:35:23  08:37:23  C     90-100  77    Sunny          Startle
10/31/21 08:35:23  08:37:23  C     90-100  77    Sunny          Chase
10/31/21 08:35:23  08:38:23  C     90-100  77    Sunny
10/31/21 08:35:23  08:39:23  C     90-100  77    Sunny
10/31/21 08:35:23  08:40:23  C     90-100  77    Sunny

I really hope this makes sense. I have tried different functions in dplyr, but I honestly don’t even know where to start. Any help is appreciated!

Aucun commentaire:

Enregistrer un commentaire