vendredi 20 novembre 2015

Better approach to doing ifelse by group

I have dataset with ID, Weight, Date, Status and Baseline column as follows

  Id     Weight     Date         Status     Baseline
  5      121.23     NA           No         NA
  5      134.12     NA           No         NA
  5      167.63     NA           No         NA
  5      139.22     NA           No         NA
  23     232.19     11/10/2009   Yes        11/06/2015
  23     274.13     10/03/2004   Yes        11/06/2015
  23     134.23     07/11/2006   Yes        11/06/2015

If status is No, then I want the new column Recent_Weight to store the weight closest to the SysDate () for that Id. If status is Yes, then I want the Recent_Weight to be the the weight closest to the date in the Baseline column.

  Id     Weight     Date         Status     Baseline      Recent_Weight
  5      121.23     04/16/1995   No         NA            134.12 
  5      134.12     12/17/2008   No         NA            134.12  
  5      167.63     03/15/2002   No         NA            134.12
  5      139.22     08/15/2001   No         NA            134.12
  23     232.19     11/10/2009   Yes        11/06/2001    274.13    
  23     274.13     10/03/2004   Yes        11/06/2001    274.13
  23     134.23     07/11/2006   Yes        11/06/2001    274.13

Right now I am doing this using mutate and group_by as follows

library(dplyr)
Test2 %>% 
  group_by(Id) %>% 
  mutate(New_Weight = ifelse(Status== "No",   
  as.numeric(Weight[which.min(abs(Sys.Date() - Date))]), 
  as.numeric(Weight[which.min(abs(Baseline   - DATE))])))

However I get very inconsistent results. I have to shutdown and restart everytime or else I get wrong answers(I know this is very strange). I am wondering if anybody can suggest me a better way to obtain the same results. Thanks y'all.I appreciate your time.

Aucun commentaire:

Enregistrer un commentaire