I would like to conditionally replace the values in one data frame with the values in another using nested ifelse() calls, but I'm having trouble extending this to the whole data frame with apply(). I want to avoid loops and non-base packages if possible.
The first is a data frame with 6 observations of 10 character variables:
> snp_test
  L1 L2 L3 L4 L5 L6 L7 L8 L9 L10
1  1  2  -  0  2  0  0  0  0   2
2  1  0  -  0  -  1  0  -  -   2
3  -  -  -  0  -  -  0  -  -   1
4  2  0  0  0  0  -  0  0  0   0
5  2  0  -  0  2  -  0  0  0   1
6  1  0  -  0  0  0  0  0  -   0
The second is a data frame with one row per variable in snp_test and three columns of data (character; each value is two letters separated by a space) relating to that variable:
> locus_test
   locus gt0 gt1 gt2
1     L1 G G A A G A
2     L2 T T G G T G
3     L3 A A C C A C
4     L4 T T A A T A
5     L5 G G C C G C
6     L6 C C A A C A
7     L7 T T C C T C
8     L8 A A G G A G
9     L9 A A G G A G
10   L10 G G A A G A
I would like to replace the values in snp_test with the values in locus_test. For example, when L1==1, the 1 is replaced with the corresponding value in locus_test$gt1 ("A A"). When L1==2, the value in the gt2 column is used ("G A").
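As a small illustration of the intended mapping for a single column, here is a sketch using a named lookup vector for L1. It assumes that 0 maps to gt0 in the same way that 1 and 2 map to gt1 and gt2, and that "-" is left unchanged; the names l1, lookup_l1 and wanted_l1 are made up for this example.

```r
# Column L1 of snp_test, typed in by hand so the snippet is self-contained
l1 <- c("1", "1", "-", "2", "2", "1")

# Row L1 of locus_test as a named lookup vector (assumption: 0 -> gt0)
lookup_l1 <- c("0" = "G G", "1" = "A A", "2" = "G A")

# Replace codes that have an entry in the lookup; keep everything else ("-")
wanted_l1 <- ifelse(l1 %in% names(lookup_l1), unname(lookup_l1[l1]), l1)
print(wanted_l1)
```

This is only the target result for one column; it does not yet address how to pick the right row of locus_test for every column.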
I can do this for each variable separately:
ifelse(snp_test[,1]==1,locus_test$gt1[locus_test$locus =="L1"],snp_test[,1])
Then I would nest the ifelse, so that the three different values are replaced with their corresponding values in locus_test, e.g.:
ifelse(snp_test[,1]==1, locus_test$gt1[locus_test$locus =="L1"],
       ifelse(snp_test[,1]==2, locus_test$gt2[locus_test$locus =="L1"],
              snp_test[,1]))
And so on...
But when I apply this over all of the variables in snp_test, i.e.
apply(snp_test,2,function(x)ifelse(x==1,locus_test$gt1,x))
the first six values of locus_test$gt1 are used as the replacement values, rather than the single value that relates to each column. So I would like to know how to add the necessary index, so that a value replaced in, for example, the L1 column of snp_test can only ever be one of the three values corresponding to L1 in locus_test.
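The behaviour described here follows from how ifelse() picks its replacement: element i of the result comes from element i of the yes argument. A minimal sketch with made-up vectors (x and yes are hypothetical, not taken from the data above):

```r
x   <- c("1", "0", "1", "1")          # hypothetical genotype codes
yes <- c("A A", "G G", "C C", "T T")  # hypothetical vector of replacements

# Passing the whole vector: position i of the result takes yes[i]
by_position <- ifelse(x == "1", yes, x)
print(by_position)

# Passing a single value: the same replacement is used everywhere
single_value <- ifelse(x == "1", yes[1], x)
print(single_value)
```

So passing all of locus_test$gt1 makes each row of snp_test pick up the gt1 value at the same row position, whereas a length-one replacement is reused for every row.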
In other words, how can I specify the subset part of the ifelse:
locus_test$gt1[locus_test$locus =="L1"]
in apply?
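One possible sketch in base R, offered as an illustration rather than a definitive answer: iterate over the column names instead of the columns, so the name is available for looking up the matching row of locus_test. It assumes 0 maps to gt0, "-" stays as it is, and every column name appears exactly once in locus_test$locus; recode_column and snp_recoded are names invented here.

```r
# Rebuild the two data frames from the question so the snippet runs on its own
snp_test <- data.frame(
  L1  = c("1", "1", "-", "2", "2", "1"),
  L2  = c("2", "0", "-", "0", "0", "0"),
  L3  = c("-", "-", "-", "0", "-", "-"),
  L4  = c("0", "0", "0", "0", "0", "0"),
  L5  = c("2", "-", "-", "0", "2", "0"),
  L6  = c("0", "1", "-", "-", "-", "0"),
  L7  = c("0", "0", "0", "0", "0", "0"),
  L8  = c("0", "-", "-", "0", "0", "0"),
  L9  = c("0", "-", "-", "0", "0", "-"),
  L10 = c("2", "2", "1", "0", "1", "0"),
  stringsAsFactors = FALSE
)
locus_test <- data.frame(
  locus = paste0("L", 1:10),
  gt0 = c("G G", "T T", "A A", "T T", "G G", "C C", "T T", "A A", "A A", "G G"),
  gt1 = c("A A", "G G", "C C", "A A", "C C", "A A", "C C", "G G", "G G", "A A"),
  gt2 = c("G A", "T G", "A C", "T A", "G C", "C A", "T C", "A G", "A G", "G A"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: recode one column, given its name
recode_column <- function(col_name) {
  x   <- snp_test[[col_name]]
  row <- match(col_name, locus_test$locus)   # row of locus_test for this column
  ifelse(x == "0", locus_test$gt0[row],      # assumption: 0 -> gt0
  ifelse(x == "1", locus_test$gt1[row],
  ifelse(x == "2", locus_test$gt2[row], x))) # anything else ("-") is kept
}

# sapply() over the names returns a character matrix with one column per locus
snp_recoded <- as.data.frame(sapply(names(snp_test), recode_column),
                             stringsAsFactors = FALSE)
print(snp_recoded)
```

Because each replacement passed to ifelse() is a single value here, the positional matching that caused the original problem no longer applies.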