Monday, 29 June 2020

Mutate a new column based on conditions in another row in R

I am working with a dataset of animal behaviors and am trying to create a new column ("environment") whose value depends on another row. Specifically, I want the new column to contain "water" if a behavior's start and stop times fall within the start/stop times of the behavior "o_water", and "land" if they fall outside those bounds. In case that is unclear, here is a minimal example:

library(dplyr) 
library(magrittr)

otters <- data.frame(
  observation_id = 1,
  subject = 1,
  behavior = c("o_water", "run", "sit", "o_land", "walk"),
  start_time = c(1,1,2,6,6),
  stop_time = c(5,3,4,10,9)
)

# This works, but only because the bounds (1 and 5) are typed in by hand.
# I need to look the bounds up from the data across a large dataset.
otters <- otters %>%
  group_by(subject, observation_id, behavior) %>%
  mutate(environment = ifelse(start_time >= 1 & stop_time <= 5, "water", "land"))

This is the desired output:

Groups:   subject, observation_id, behavior [5]
  observation_id subject behavior start_time stop_time environment
           <dbl>   <dbl> <fct>         <dbl>     <dbl> <chr>      
1              1       1 o_water           1         5 water      
2              1       1 run               1         3 water      
3              1       1 sit               2         4 water      
4              1       1 o_land            6        10 land       
5              1       1 walk              6         9 land       

The second set of commands (the group_by/mutate pipeline) is roughly what I want, but I need it to find the "o_water" bounds itself and apply them across the entire dataset, rather than having me type in each start and stop time. The grouping is there so the functions are applied only to the relevant rows; the full dataset contains multiple subjects and observation_ids.

I've tried using when() and case_when() to no avail. I am very much a novice at R, so I would appreciate any help!

Apologies for any missteps on my part. I haven't been able to find a problem quite like this elsewhere on Stack Overflow.
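One possible way to generalize the manual version is sketched below. It is only a sketch and rests on an assumption not stated in the question: each subject/observation_id group has at most one "o_water" row (if there are several, only the first is used). The helper columns water_start and water_stop and the result name otters_env are made up for illustration. The idea is to group by subject and observation_id only (not by behavior), pull the "o_water" bounds out of each group, and compare every row against them.

```r
library(dplyr)

otters <- data.frame(
  observation_id = 1,
  subject = 1,
  behavior = c("o_water", "run", "sit", "o_land", "walk"),
  start_time = c(1, 1, 2, 6, 6),
  stop_time = c(5, 3, 4, 10, 9)
)

otters_env <- otters %>%
  # Group by observation only, so every row can "see" its group's o_water row
  group_by(subject, observation_id) %>%
  mutate(
    # Hypothetical helper columns: bounds of the first o_water row in the group
    # (NA when the group has no o_water row at all)
    water_start = start_time[behavior == "o_water"][1],
    water_stop  = stop_time[behavior == "o_water"][1],
    # A row counts as "water" only if it lies entirely inside those bounds
    environment = if_else(
      !is.na(water_start) &
        start_time >= water_start &
        stop_time <= water_stop,
      "water", "land"
    )
  ) %>%
  ungroup() %>%
  select(-water_start, -water_stop)

print(otters_env)
```

With this approach, groups without any "o_water" row are labeled "land" throughout. If an observation can contain several separate "o_water" bouts, this sketch would not be enough, and something like a join of each behavior against all water intervals would be needed instead.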
