Tuesday, 17 September 2019

Using the "less than" operator with dates in R

I have a data frame with three date columns (date1, date2, date3) and two additional measures, "s1" and "s2". I am trying to create new columns "x1" and "x2" from these dates and measures. Column "x1" should take the value 3 if date1 is less than or equal to date2; otherwise it should keep the value of s1. Similarly, column "x2" should take the value 3 if date1 is less than or equal to date3; otherwise it should keep the value of s2. Below is a sample of the data:

df <- structure(
  list(
    id    = c(1L, 2L, 3L, 4L, 5L),
    date1 = c("1/4/2004", "3/8/2004", "NA", "13/10/2004", "11/3/2003"),
    date2 = c("8/6/2002", "11/5/2004", "3/5/2004", "25/11/2004", "21/1/2004"),
    s1    = c(1, 2, 1, "NA", "NA"),
    date3 = c("23/6/2006", "24/12/2006", "18/2/2006", "NA", "NA"),
    s2    = c("NA", "NA", 2, "NA", "NA")
  ),
  .Names = c("id", "date1", "date2", "s1", "date3", "s2"),
  class = "data.frame",
  row.names = c(NA, -5L),
  col_types = c("numeric", "date", "date", "numeric", "date", "numeric")
)

I have tried the following code:

df$x1 <- ifelse(df$date1 <= df$date2, 3, df$s1)
df$x2 <- ifelse(df$date1 <= df$date3, 3, df$s2)

It gives:

  id      date1      date2 s1      date3 s2 x1 x2
1  1   1/4/2004   8/6/2002  1  23/6/2006 NA  3  3
2  2   3/8/2004  11/5/2004  2 24/12/2006 NA  2 NA
3  3         NA   3/5/2004  1  18/2/2006  2  1  2
4  4 13/10/2004 25/11/2004 NA         NA NA  3  3
5  5  11/3/2003  21/1/2004 NA         NA NA  3  3

From this, it appears that row 2 of column "x2" does not follow the rule: "3/8/2004" is earlier than "24/12/2006", so I expected 3 rather than NA in column "x2". Could anyone explain why this is happening and how it can be resolved? Your help is greatly appreciated.
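
A possible explanation, offered as a sketch rather than a definitive answer: the date columns are stored as character strings, so <= compares them character by character instead of chronologically, and "3/8/2004" sorts after "24/12/2006" because "3" comes after "2". The code below assumes the dates are in day/month/year order and that the "NA" strings are meant to be real missing values. It rebuilds the sample on those assumptions, converts the columns with as.Date(), and repeats the comparison.

```r
# Rebuild the sample with real NA values instead of the string "NA"
# (an assumption about what the original data is meant to contain).
df <- data.frame(
  id    = 1:5,
  date1 = c("1/4/2004", "3/8/2004", NA, "13/10/2004", "11/3/2003"),
  date2 = c("8/6/2002", "11/5/2004", "3/5/2004", "25/11/2004", "21/1/2004"),
  s1    = c(1, 2, 1, NA, NA),
  date3 = c("23/6/2006", "24/12/2006", "18/2/2006", NA, NA),
  s2    = c(NA, NA, 2, NA, NA),
  stringsAsFactors = FALSE
)

# Character strings are compared lexicographically, not as dates.
"3/8/2004" <= "24/12/2006"   # FALSE, because "3" sorts after "2"

# Convert the three columns to Date, assuming day/month/year order.
date_cols <- c("date1", "date2", "date3")
df[date_cols] <- lapply(df[date_cols], as.Date, format = "%d/%m/%Y")

# The comparisons are now chronological.
df$x1 <- ifelse(df$date1 <= df$date2, 3, df$s1)
df$x2 <- ifelse(df$date1 <= df$date3, 3, df$s2)
```

With real dates, row 2 of x2 becomes 3. Rows where either date is missing yield NA from ifelse(); if the original s1 or s2 value should be kept there instead, the test could be wrapped as !is.na(test) & test.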
