Tuesday, February 25, 2020

How to handle or ignore NAs when using ifelse to mutate a new column with multiple conditions (dplyr)

I am new to dplyr and am trying to create a composite variable from three age variables using dplyr and ifelse. Here is a data frame that illustrates the situation:

z <- data.frame("j6" = c(6, 19, NA, NA, NA, NA, NA, 8, 20, 20, NA), 
                "j7" = c(27, 20, NA, 7, 19, NA, NA, 20, 30, 9, NA),
                "j8" = c(8, 22, NA, 20, NA, 8, 30, NA, NA, NA, 3))

library(dplyr)

z <- z %>% 
        # flag rows where any of the three ages is under 18
        mutate(age_event = ifelse(j6 < 18 | j7 < 18 | j8 < 18, 1, 0))

My expectations:

  • The three columns (j6, j7, and j8) hold ages. If at least one of them is less than 18, the new column (age_event) should be 1; otherwise it should be 0.
  • If two of the three columns are 18 or older and the third is NA, age_event should be 0.
  • Likewise, if one of the three columns is 18 or older and the other two are NA, age_event should be 0.
  • If all three columns are NA, age_event should be NA.

However, the actual result is as follows, with the problem rows marked:

> z
   j6 j7 j8 age_event
1   6 27  8         1
2  19 20 22         0
3  NA NA NA        NA
4  NA  7 20         1
5  NA 19 NA        NA  <-- should be 0, but NA
6  NA NA  8         1
7  NA NA 30        NA  <-- should be 0, but NA
8   8 20 NA         1
9  20 30 NA        NA  <-- should be 0, but NA
10 20  9 NA         1
11 NA NA  3         1
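
For context, the unexpected NAs in rows 5, 7, and 9 appear to come from how R evaluates `|` when some operands are NA: the result is TRUE as soon as any operand is TRUE, but it stays NA when no operand is TRUE and at least one is NA. The short sketch below reproduces rows 4 and 5 by hand; the names `row4` and `row5` are made up for illustration.

```r
# Comparing NA with a number yields NA
NA < 18

# `|` can decide TRUE when one side is TRUE, even if the other is NA
NA | TRUE

# With no TRUE operand, the NA cannot be ruled out, so the result is NA
NA | FALSE

# Row 4 of z: j6 = NA, j7 = 7, j8 = 20 -> one TRUE comparison wins
row4 <- (NA < 18) | (7 < 18) | (20 < 18)

# Row 5 of z: j6 = NA, j7 = 19, j8 = NA -> FALSE combined with NA stays NA
row5 <- (NA < 18) | (19 < 18) | (NA < 18)

print(row4)
print(row5)
```

ifelse() passes an NA test value straight through as NA, which is why those rows never reach the 0 branch.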

I'd like to know if there is a way to turn the 5th, 7th, and 9th observations above into 0s using mutate and ifelse. Any suggestions would be greatly appreciated!
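
One possible approach, offered as a sketch rather than the definitive answer, is to take the youngest recorded age in each row with base R's `pmin(..., na.rm = TRUE)` and test that single value. `pmin` with `na.rm = TRUE` returns NA only when every value in that position is NA, which matches the four expectations listed above. The helper column `min_age` is a hypothetical name introduced here and dropped at the end.

```r
library(dplyr)

z <- data.frame(
  j6 = c(6, 19, NA, NA, NA, NA, NA, 8, 20, 20, NA),
  j7 = c(27, 20, NA, 7, 19, NA, NA, 20, 30, 9, NA),
  j8 = c(8, 22, NA, 20, NA, 8, 30, NA, NA, NA, 3)
)

z <- z %>%
  mutate(
    # Youngest non-missing age per row; NA only if all three ages are NA
    min_age = pmin(j6, j7, j8, na.rm = TRUE),
    # 1 if the youngest age is under 18, 0 if not, NA if no age is recorded
    age_event = ifelse(min_age < 18, 1, 0)
  ) %>%
  select(-min_age)  # drop the illustrative helper column

print(z)
```

With this version, rows 5, 7, and 9 evaluate to 0 because their only recorded ages (19, 30, and 20/30) are all 18 or older, while row 3 stays NA.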
