Wednesday, 7 June 2017

Direct way of telling ifelse to ignore NA

As explained here, when the test condition in ifelse(test, yes, no) is NA, the result is also NA. Hence the following returns...

df <- data.frame(a = c(1, 1, NA, NA, NA, NA),
                 b = c(NA, NA, 1, 1, NA, NA),
                 c = c(rep(NA, 4), 1, 1))
ifelse(df$a==1, "a==1", 
    ifelse(df$b==1, "b==1", 
        ifelse(df$c==1, "c==1", NA)))
#[1] "a==1" "a==1" NA     NA     NA     NA    

... instead of the desired

#[1] "a==1" "a==1" "b==1" "b==1"  "c==1" "c==1" 
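
The behaviour can be seen in isolation with a minimal sketch (the names na_cmp and na_res are just for illustration): comparing NA with a value gives NA rather than FALSE, and ifelse passes that NA straight through instead of falling into the "no" branch.

```r
# Comparing NA with a value yields NA, not FALSE.
na_cmp <- NA == 1
na_cmp

# ifelse() returns NA wherever the test is NA, so neither branch is used there.
na_res <- ifelse(c(TRUE, FALSE, NA), "yes", "no")
na_res
```

This is why the nested call never reaches the checks on b and c for rows where a is NA.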

As suggested by Cath, I can circumvent this problem by explicitly requiring that the value being tested is not NA:

ifelse(df$a==1 &  !is.na(df$a), "a==1", 
    ifelse(df$b==1 & !is.na(df$b), "b==1", 
        ifelse(df$c==1 & !is.na(df$c), "c==1", NA)))

However, as akrun also noted, this solution becomes rather lengthy as the number of columns increases.


A workaround would be to first replace all NAs with a value not present in the data.frame (e.g., 2 in this case):

df_noNA <- data.frame(a = c(1, 1, 2, 2, 2, 2),
                 b = c(2, 2, 1, 1, 2, 2),
                 c = c(rep(2, 4), 1, 1))

ifelse(df_noNA$a==1, "a==1", 
    ifelse(df_noNA$b==1, "b==1", 
        ifelse(df_noNA$c==1, "c==1", NA)))
#[1] "a==1" "a==1" "b==1" "b==1"  "c==1" "c==1" 
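
Rather than retyping the data, the replacement could presumably be done programmatically. The following is only a sketch: sentinel is a name I made up, and the value 2 is assumed not to occur anywhere in the data, which the code checks before replacing.

```r
df <- data.frame(a = c(1, 1, NA, NA, NA, NA),
                 b = c(NA, NA, 1, 1, NA, NA),
                 c = c(rep(NA, 4), 1, 1))

# Assumed placeholder value; stop if it already occurs in the data.
sentinel <- 2
stopifnot(!any(df == sentinel, na.rm = TRUE))

# Copy the data.frame and overwrite every NA with the placeholder.
df_noNA <- df
df_noNA[is.na(df_noNA)] <- sentinel

res_noNA <- ifelse(df_noNA$a == 1, "a==1",
                ifelse(df_noNA$b == 1, "b==1",
                    ifelse(df_noNA$c == 1, "c==1", NA)))
res_noNA
```

The drawback remains that a safe placeholder has to be chosen for each data set.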

However, is there a more direct way to tell ifelse to ignore NAs? Or is wrapping the & !is.na check in a function the most direct way?

ignorena <- function(column) {
    column == 1 & !is.na(column)
}
ifelse(ignorena(df$a), "a==1", 
    ifelse(ignorena(df$b), "b==1", 
        ifelse(ignorena(df$c), "c==1", NA)))
#[1] "a==1" "a==1" "b==1" "b==1"  "c==1" "c==1" 
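
To keep this from growing with the number of columns, the same idea might be applied in a loop instead of nesting ifelse calls. This is a rough sketch, not a definitive answer: is_value and result are hypothetical names, and the labels are assumed to follow the "<column>==1" pattern used above.

```r
df <- data.frame(a = c(1, 1, NA, NA, NA, NA),
                 b = c(NA, NA, 1, 1, NA, NA),
                 c = c(rep(NA, 4), 1, 1))

# Hypothetical helper: TRUE only where the column equals `value` and is not NA.
is_value <- function(column, value = 1) {
    !is.na(column) & column == value
}

# Start with all NA, then fill from the last column to the first so that
# earlier columns take priority, as in the nested ifelse() calls.
result <- rep(NA_character_, nrow(df))
for (col in rev(names(df))) {
    result[is_value(df[[col]])] <- paste0(col, "==1")
}
result
```

Rows where no column equals 1 stay NA, matching the final "no" argument of the innermost ifelse.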
