Consider the following data frame, dfn:
dfn <- structure(list(GID7173723 = c("A", "T", "G", "A", "G"), GID4878677 = c("G",
"C", "G", "A", "G"), GID88208 = c("A", "T", "G", "A", "G"), GID346403 = c("A",
"T", "G", "A", "G"), GID268825 = c("G", "C", "G", "A", "G")), row.names = c(NA,
5L), class = "data.frame")
Here is how it looks:
  GID7173723 GID4878677 GID88208 GID346403 GID268825
1          A          G        A         A         G
2          T          C        T         T         C
3          G          G        G         G         G
4          A          A        A         A         A
5          G          G        G         G         G
Now consider two vectors:
ref <- c("A", "T", "G", "A", "G")
alt <- c("G", "C", "T", "C", "A")
And the function:
f = function(x){
  ifelse(x==ref,2,x)
  ifelse(x==alt,0,x)
}
When I run sapply, only the second ifelse seems to take effect:
sapply(dfn,f)
     GID7173723 GID4878677 GID88208 GID346403 GID268825
[1,] "A"        "0"        "A"      "A"       "0"
[2,] "T"        "0"        "T"      "T"       "0"
[3,] "G"        "G"        "G"      "G"       "G"
[4,] "A"        "A"        "A"      "A"       "A"
[5,] "G"        "G"        "G"      "G"       "G"
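As far as I can tell, both ifelse calls do run, but a function returns only the value of its last expression, so the result of the first call is thrown away. A minimal sketch of one way around this, assuming it is acceptable to nest the two tests (f_nested and res_nested are names made up for this illustration):

```r
# Same data as above, rebuilt here so the snippet runs on its own.
dfn <- data.frame(
  GID7173723 = c("A", "T", "G", "A", "G"),
  GID4878677 = c("G", "C", "G", "A", "G"),
  GID88208   = c("A", "T", "G", "A", "G"),
  GID346403  = c("A", "T", "G", "A", "G"),
  GID268825  = c("G", "C", "G", "A", "G"),
  stringsAsFactors = FALSE
)
ref <- c("A", "T", "G", "A", "G")
alt <- c("G", "C", "T", "C", "A")

# f_nested is a hypothetical name. The inner ifelse is the "no" branch of the
# outer one, so both tests feed into the single value the function returns.
f_nested <- function(x) {
  ifelse(x == ref, 2, ifelse(x == alt, 0, x))
}

# sapply applies f_nested to each column and binds the results into a matrix.
res_nested <- sapply(dfn, f_nested)
```

Because some columns may still contain letters, the combined matrix is of type character, so the codes come back as "2" and "0" rather than as numbers.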
If I instead try something like this:
f = function(x){
  if (x==ref) {
    return(2)
  }
  else if (x==alt) {
    return(0)
  }
  else {
    return(x)
  }
}
I get the following warning messages:
sapply(dfn,f)
Warning messages:
1: In if (x == ref) { :
the condition has length > 1 and only the first element will be used
2: In if (x == ref) { :
the condition has length > 1 and only the first element will be used
3: In if (x == alt) { :
the condition has length > 1 and only the first element will be used
4: In if (x == ref) { :
the condition has length > 1 and only the first element will be used
5: In if (x == ref) { :
the condition has length > 1 and only the first element will be used
6: In if (x == ref) { :
the condition has length > 1 and only the first element will be used
7: In if (x == alt) { :
the condition has length > 1 and only the first element will be used
I believe the warnings from the latter function arise because if/else is not vectorized: it expects a single TRUE or FALSE, while x == ref compares a whole column at once. I would really like to solve this problem without using for loops or sweep, relying only on if/else statements combined with the apply family of functions.
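To make the goal concrete, here is a rough sketch of the kind of solution I have in mind, assuming it is fine to call a scalar function once per cell. recode_one and res_scalar are hypothetical names, and returning "2" and "0" as strings (instead of the numbers 2 and 0 used above) is my own assumption so that every branch has the same type:

```r
# Same data as above, rebuilt here so the snippet runs on its own.
dfn <- data.frame(
  GID7173723 = c("A", "T", "G", "A", "G"),
  GID4878677 = c("G", "C", "G", "A", "G"),
  GID88208   = c("A", "T", "G", "A", "G"),
  GID346403  = c("A", "T", "G", "A", "G"),
  GID268825  = c("G", "C", "G", "A", "G"),
  stringsAsFactors = FALSE
)
ref <- c("A", "T", "G", "A", "G")
alt <- c("G", "C", "T", "C", "A")

# recode_one (hypothetical helper) receives one genotype, one ref allele and
# one alt allele, so every if condition has length 1 and no warning is raised.
recode_one <- function(g, r, a) {
  if (g == r) {
    "2"
  } else if (g == a) {
    "0"
  } else {
    g
  }
}

# The outer sapply walks over the columns; the inner mapply walks down each
# column in step with ref and alt, calling recode_one once per row.
res_scalar <- sapply(dfn, function(col) {
  mapply(recode_one, col, ref, alt, USE.NAMES = FALSE)
})
```

This keeps plain if/else and uses only apply-family functions, at the cost of one function call per cell, which could be slow on large genotype tables.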