Thursday, 17 September 2020

How can I ignore NAs across multiple columns in an if-else statement in R?

I have a data frame which looks like this:

     a    b   c   d
10 yes  yes yes yes
11 yes  yes yes yes
12 yes  yes yes yes
13 yes  yes yes yes
14  no <NA>  no  no
15  no <NA>  no  no
16  no <NA>  no  no
17  no <NA>  no  no
18  no <NA>  no  no
19  no <NA>  no  no
20  no <NA>  no  no

I have an ifelse() statement that creates a new column with the values 1 or 0, depending on whether the answers in the previous columns are yes or no. However, my code does not account for NAs. This is the code I have used:

y <- x %>%
  mutate(
    health_ever = ifelse(
      a == 'yes    ' |
        b == 'yes' |
        c == 'yes' |
        d == 'yes',
      1,
      0
    )
  )
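
To illustrate why the NAs leak into the new column, here is a minimal base R sketch. The vector `b` below is a hypothetical two-element stand-in for column b, not the full data. Comparing NA with `==` returns NA, and `NA | FALSE` is also NA, so `ifelse()` returns NA for any row where no other column is 'yes'. `%in%` is shown as one possible alternative, because it returns FALSE for NA.

```r
# Three-valued logic in base R: NA means "unknown".
na_or_false <- NA | FALSE  # NA: the result depends on the unknown value
na_or_true  <- NA | TRUE   # TRUE: the result is TRUE either way

# Hypothetical stand-in for column b: one "yes" and one NA.
b <- factor(c("yes", NA), levels = c("yes", "no"))

eq_result <- b == "yes"    # the NA element compares to NA
in_result <- b %in% "yes"  # the NA element compares to FALSE

# FALSE stands in for the other columns, which are "no" in the NA rows.
print(ifelse(eq_result | FALSE, 1, 0))
print(ifelse(in_result | FALSE, 1, 0))
```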

Here is the code to reproduce it:

x <- structure(
  list(
    a = structure(
      c(6L, 6L, 6L, 6L, 7L, 7L,
        7L, 7L, 7L, 7L, 7L),
      .Label = c(
        "missing",
        "inapplicable",
        "proxy respondent       ",
        "refusal",
        "don't know",
        "yes    ",
        "no     "
      ),
      class = "factor"
    ),
    b = structure(
      c(6L, 6L, 6L, 6L, NA, NA, NA, NA, NA,
        NA, NA),
      .Label = c(
        "missing",
        "inapplicable",
        "proxy",
        "refusal",
        "don't know",
        "yes",
        "no"
      ),
      class = "factor"
    ),
    c = structure(
      c(6L,
        6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L),
      .Label = c(
        "missing",
        "inapplicable",
        "proxy",
        "refusal",
        "don't know",
        "yes",
        "no"
      ),
      class = "factor"
    ),
    d = structure(
      c(6L, 6L,
        6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L),
      .Label = c(
        "missing",
        "inapplicable",
        "proxy",
        "refusal",
        "don't know",
        "yes",
        "no"
      ),
      class = "factor"
    )
  ),
  row.names = 10:20,
  class = "data.frame"
)

How can I change my code so that it ignores any NAs and still returns 1 or 0 based on the other columns? This is my desired output:

    a    b   c   d e
1 yes  yes yes yes 1
2 yes  yes yes yes 1
3 yes  yes yes yes 1
4 yes  yes yes yes 1
5  no <NA>  no  no 0
6  no <NA>  no  no 0
7  no <NA>  no  no 0
8  no <NA>  no  no 0
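
One possible way to get this kind of output is sketched below in base R. It assumes the rule is "1 if any column is yes", as in the `|` chain above. The data frame `x_small` is a hypothetical two-row stand-in for the data, not the real `x`. Trimming the labels with `trimws()` handles the padded 'yes    ' values in column a, and `na.rm = TRUE` drops the NA comparisons.

```r
# Hypothetical two-row stand-in for the data: one all-"yes" row and
# one "no" row with an NA in column b.
x_small <- data.frame(
  a = factor(c("yes    ", "no     ")),
  b = factor(c("yes", NA), levels = c("yes", "no")),
  c = factor(c("yes", "no")),
  d = factor(c("yes", "no"))
)

# Compare each column to "yes" after trimming padding from the labels.
# The result is a logical matrix with one column per variable.
is_yes <- sapply(
  x_small[c("a", "b", "c", "d")],
  function(col) trimws(as.character(col)) == "yes"
)

# Count the TRUE values per row, skipping NA comparisons.
x_small$health_ever <- as.integer(rowSums(is_yes, na.rm = TRUE) > 0)

print(x_small)
```

Note that under this sketch a row in which every column is NA would get 0 rather than NA, which may or may not be what is wanted.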
