I have dataset with ID, Weight, Date, Status and Baseline column as follows
Id Weight Date Status Baseline
5 121.23 NA No NA
5 134.12 NA No NA
5 167.63 NA No NA
5 139.22 NA No NA
23 232.19 11/10/2009 Yes 11/06/2015
23 274.13 10/03/2004 Yes 11/06/2015
23 134.23 07/11/2006 Yes 11/06/2015
If status is No, then I want the new column Recent_Weight to store the weight closest to the SysDate () for that Id. If status is Yes, then I want the Recent_Weight to be the the weight closest to the date in the Baseline column.
Id Weight Date Status Baseline Recent_Weight
5 121.23 04/16/1995 No NA 134.12
5 134.12 12/17/2008 No NA 134.12
5 167.63 03/15/2002 No NA 134.12
5 139.22 08/15/2001 No NA 134.12
23 232.19 11/10/2009 Yes 11/06/2001 274.13
23 274.13 10/03/2004 Yes 11/06/2001 274.13
23 134.23 07/11/2006 Yes 11/06/2001 274.13
Right now I am doing this using mutate and group_by as follows
library(dplyr)
Test2 %>%
group_by(Id) %>%
mutate(New_Weight = ifelse(Status== "No",
as.numeric(Weight[which.min(abs(Sys.Date() - Date))]),
as.numeric(Weight[which.min(abs(Baseline - DATE))])))
However I get very inconsistent results. I have to shutdown and restart everytime or else I get wrong answers(I know this is very strange). I am wondering if anybody can suggest me a better way to obtain the same results. Thanks y'all.I appreciate your time.
Aucun commentaire:
Enregistrer un commentaire