Friday, 31 July 2020

Why is the second ifelse the only one evaluated in R, and why does if/else not vectorize?

Consider the following data frame, dfn:

dfn <- structure(list(GID7173723 = c("A", "T", "G", "A", "G"), GID4878677 = c("G", 
"C", "G", "A", "G"), GID88208 = c("A", "T", "G", "A", "G"), GID346403 = c("A", 
"T", "G", "A", "G"), GID268825 = c("G", "C", "G", "A", "G")), row.names = c(NA, 
5L), class = "data.frame")

Here is how it looks:

  GID7173723 GID4878677 GID88208 GID346403 GID268825
1          A          G        A         A         G
2          T          C        T         T         C
3          G          G        G         G         G
4          A          A        A         A         A
5          G          G        G         G         G

Now consider two vectors:

ref <- c("A", "T", "G", "A", "G")
alt <- c("G", "C", "T", "C", "A")

And the function:

f = function(x){
  ifelse(x==ref,2,x)
  ifelse(x==alt,0,x)
}

When I run sapply, only the second ifelse seems to be evaluated:

sapply(dfn,f)

     GID7173723 GID4878677 GID88208 GID346403 GID268825
[1,] "A"        "0"        "A"      "A"       "0"      
[2,] "T"        "0"        "T"      "T"       "0"      
[3,] "G"        "G"        "G"      "G"       "G"      
[4,] "A"        "A"        "A"      "A"       "A"      
[5,] "G"        "G"        "G"      "G"       "G"    
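
A likely explanation for the output above: an R function returns the value of its last evaluated expression, so the first ifelse is computed and its result is then discarded. Below is a minimal sketch of one way around that, nesting the two tests in a single ifelse. The name f_nested is hypothetical, and the data is rebuilt here only so the snippet runs on its own. Because unmatched bases stay as letters, the result should generally be expected as a character matrix ("2" and "0" rather than numbers).

```r
# Data from the question, rebuilt so the snippet is self-contained
dfn <- data.frame(
  GID7173723 = c("A", "T", "G", "A", "G"),
  GID4878677 = c("G", "C", "G", "A", "G"),
  GID88208   = c("A", "T", "G", "A", "G"),
  GID346403  = c("A", "T", "G", "A", "G"),
  GID268825  = c("G", "C", "G", "A", "G"),
  stringsAsFactors = FALSE
)
ref <- c("A", "T", "G", "A", "G")
alt <- c("G", "C", "T", "C", "A")

# f_nested is a hypothetical name. The second test lives in the "no"
# branch of the first, so neither result is thrown away.
f_nested <- function(x) {
  ifelse(x == ref, 2, ifelse(x == alt, 0, x))
}

# sapply passes one column at a time; each column has the same length
# as ref and alt, so the comparisons line up row by row.
res_nested <- sapply(dfn, f_nested)
print(res_nested)
```

This relies on every column having exactly as many rows as ref and alt; with a different length, R would recycle the shorter vector.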

If I instead run something like this:

f = function(x){
  if (x==ref) {
    return(2)
  } else if (x==alt) {
    return(0)
  } else {
    return(x)
  }
}

I get the following warning messages:

sapply(dfn,f)

Warning messages:
1: In if (x == ref) { :
  the condition has length > 1 and only the first element will be used
2: In if (x == ref) { :
  the condition has length > 1 and only the first element will be used
3: In if (x == alt) { :
  the condition has length > 1 and only the first element will be used
4: In if (x == ref) { :
  the condition has length > 1 and only the first element will be used
5: In if (x == ref) { :
  the condition has length > 1 and only the first element will be used
6: In if (x == ref) { :
  the condition has length > 1 and only the first element will be used
7: In if (x == alt) { :
  the condition has length > 1 and only the first element will be used

I believe the warnings from the latter function arise because if/else is not vectorized. I would really like to solve this problem using neither for loops nor sweep, only if/else statements combined with the apply family of functions.
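
For reference, if expects a condition of length one, which is why a whole column compared against ref triggers the warning (in R 4.2.0 and later this is an error instead). A minimal sketch of what an if/else plus apply-family version might look like is given below: a scalar helper is applied one base at a time with mapply, and sapply repeats that for each column. The helper name recode_one is hypothetical, and returning "2" and "0" as strings is an assumption made here to keep a single type alongside the unmatched letters.

```r
# Data from the question, rebuilt so the snippet is self-contained
dfn <- data.frame(
  GID7173723 = c("A", "T", "G", "A", "G"),
  GID4878677 = c("G", "C", "G", "A", "G"),
  GID88208   = c("A", "T", "G", "A", "G"),
  GID346403  = c("A", "T", "G", "A", "G"),
  GID268825  = c("G", "C", "G", "A", "G"),
  stringsAsFactors = FALSE
)
ref <- c("A", "T", "G", "A", "G")
alt <- c("G", "C", "T", "C", "A")

# recode_one is a hypothetical scalar helper: x, r and a are single
# strings, so every if condition has length one.
recode_one <- function(x, r, a) {
  if (x == r) {
    "2"   # assumption: codes kept as strings to match leftover letters
  } else if (x == a) {
    "0"
  } else {
    x
  }
}

# mapply walks a column, ref and alt in parallel, element by element;
# sapply applies that to each column of the data frame.
res_scalar <- sapply(dfn, function(col) {
  mapply(recode_one, col, ref, alt, USE.NAMES = FALSE)
})
print(res_scalar)
```

The trade-off is speed: this calls the helper once per cell, so on large genotype tables a vectorized ifelse is likely to be noticeably faster.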
