Monday, 22 July 2019

Understanding NAs in ifelse

I want to create a third variable (Y1) from two existing variables (V1 and V2). I tried a nested ifelse statement, but I cannot figure out why the code behaves the way it does or how to make it do what I want.

This is a crosstable of V1 (rows) and V2 (columns):

                     No  Yes <NA>
  normal           1543   38    9
  early            1015   29   11
  late              270   25    3
  <NA>              301  226   14

Now I want to code Y1 as "Yes" if V1 is "late" or V2 is "Yes", even if the other variable is NA (total N = 591). For the remaining observations, I want to code Y1 as "No" if V1 is "early" or "normal", even if V2 is NA (N = 2578). Y1 should be NA for the rest of the observations (N = 315).

The code I used so far is:

data$Y1 <- ifelse(data$V1=="late" | data$V2=="Yes", "Yes",
           ifelse(data$V1=="normal" | data$V1=="early", "No", NA))

But I always end up with the 20 observations that have an NA for V2 and "early" or "normal" for V1 coded as NA for Y1.
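The behaviour can be reproduced on a small scale. The sketch below uses a made-up data frame called `toy` (not the real data, and assuming V1 and V2 are stored as character vectors) with one row per combination of interest, and prints the first condition separately to show where the NAs come from.

```r
# Hypothetical toy data: one row per combination of interest
toy <- data.frame(
  V1 = c("late", "normal", "early", NA, NA),
  V2 = c(NA, NA, "No", "Yes", "No"),
  stringsAsFactors = FALSE
)

# The first condition on its own. In R's three-valued logic,
# TRUE | NA is TRUE, but FALSE | NA is NA.
cond <- toy$V1 == "late" | toy$V2 == "Yes"
print(cond)
# TRUE NA FALSE TRUE NA

# ifelse() returns NA wherever its test is NA, so for those rows
# the nested ifelse() is never consulted.
toy$Y1 <- ifelse(cond, "Yes",
          ifelse(toy$V1 == "normal" | toy$V1 == "early", "No", NA))
print(toy$Y1)
# "Yes" NA "No" "Yes" NA
```

Row 2 ("normal" with a missing V2) ends up as NA, which matches what happens to the 20 observations in the real data.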

After reading several other questions regarding using NA in ifelse, I understand a bit better what is happening, especially what is happening in the second line of my code. But I still have 2 questions: 1) Why does the first ifelse statement correctly recode the 3 observations that have an NA for V2 but a "late" for V1 into "Yes"? 2) How can I code this differently to get what I want?
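One possible direction, sketched here on hypothetical toy data rather than the real `data`, is to replace `==` with `%in%`. Unlike `==`, `%in%` returns FALSE instead of NA when the left-hand value is missing, so a missing value in one variable no longer turns the whole condition into NA. The data frame `toy` and its contents are assumptions made for illustration only.

```r
# Hypothetical toy data covering the relevant combinations
toy <- data.frame(
  V1 = c("late", "normal", "early", NA, NA, "normal"),
  V2 = c(NA, NA, "No", "Yes", "No", "Yes"),
  stringsAsFactors = FALSE
)

# %in% never returns NA: a missing value simply does not match.
toy$Y1 <- ifelse(toy$V1 %in% "late" | toy$V2 %in% "Yes", "Yes",
          ifelse(toy$V1 %in% c("normal", "early"), "No", NA))
print(toy$Y1)
# "Yes" "No" "No" "Yes" NA "Yes"
```

With this version, "normal" or "early" combined with a missing V2 is coded "No", and only rows where V1 is missing and V2 is not "Yes" stay NA. Comparing `table(data$Y1, useNA = "ifany")` against the expected counts (591, 2578, 315) would be a reasonable check on the real data.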
