I am analyzing the effects of pairwise gene interactions (>300K) on fruit fly behavior and have run into an issue. I would like to count how often certain pairs of gene states (i.e., alleles) occur across my 177 lines.
Some toy tables:
state1 <- c("A","B","C","A","B","A")
state2 <- c("B","C","D","D","D","C")
df1 <- data.frame(state1,state2)
state <- c("A","B","C","D")
line_1 <- c(0,0,2,0)
line_2 <- c(2,0,2,2)
line_3 <- c(2,2,2,0)
line_4 <- c(0,2,2,2)
line_5 <- c(2,0,0,0)
df2 <- data.frame(state,line_1,line_2,line_3,line_4,line_5)
I am hoping to get an output that returns the number of lines with each state combination (state1=0 and state2=0, state1=0 and state2=2, state1=2 and state2=0, state1=2 and state2=2) and the lines with each of those combinations:
> resultdf
  state1 state2 state10state20 state10state22 state12state20 state12state22 lines00       lines02       lines20       lines22
1 A      B      1              1              2              1              line_1        line_4        line_2,line_5 line_3
2 B      C      1              2              0              2              line_5        line_1,line_2 NA            line_3,line_4
3 C      D      1              0              2              2              line_5        NA            line_1,line_3 line_2,line_4
4 A      D      1              1              2              1              line_1        line_4        line_3,line_5 line_2
5 B      D      2              1              1              1              line_1,line_5 line_2        line_3        line_4
6 A      C      0              2              1              2              NA            line_1,line_4 line_5        line_2,line_3
I started looking into for loops and if statements, but read that R has better-suited ways of doing this kind of thing. I am a novice in R (and coding in general), so I wasn't sure where to turn next. Thank you in advance for your help.
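One possible way to get this table without looping over every pair is sketched below in base R. It is only a sketch under a few assumptions: the states are coded only as 0 or 2, every state named in df1 appears exactly once in df2, and line_2 is taken as c(2,0,2,2), the value the desired output implies. The helper names (geno, g1, g2, combos, hit) are made up for illustration.

```r
# Play tables. line_2 is written as c(2,0,2,2): an assumption, chosen because
# it is the value that the desired output is consistent with.
df1 <- data.frame(state1 = c("A","B","C","A","B","A"),
                  state2 = c("B","C","D","D","D","C"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(state  = c("A","B","C","D"),
                  line_1 = c(0,0,2,0),
                  line_2 = c(2,0,2,2),
                  line_3 = c(2,2,2,0),
                  line_4 = c(0,2,2,2),
                  line_5 = c(2,0,0,0),
                  stringsAsFactors = FALSE)

# Genotype matrix: one row per state, one column per line.
geno <- as.matrix(df2[, -1])
line_names <- colnames(geno)

# For every pair in df1, pull the genotype row of each member.
# g1 and g2 are both (number of pairs) x (number of lines).
g1 <- geno[match(df1$state1, df2$state), , drop = FALSE]
g2 <- geno[match(df1$state2, df2$state), , drop = FALSE]

# The four combinations of interest: (value for state1, value for state2).
combos <- list("00" = c(0, 0), "02" = c(0, 2), "20" = c(2, 0), "22" = c(2, 2))

resultdf <- df1
lines_cols <- list()
for (nm in names(combos)) {
  v <- combos[[nm]]
  # TRUE where a line carries this combination for a given pair.
  hit <- g1 == v[1] & g2 == v[2]
  # Count of lines per pair, e.g. column "state10state20".
  resultdf[[paste0("state1", v[1], "state2", v[2])]] <- rowSums(hit)
  # Comma-separated line names per pair, NA when no line matches.
  lines_cols[[paste0("lines", nm)]] <- apply(hit, 1, function(h) {
    if (any(h)) paste(line_names[h], collapse = ",") else NA_character_
  })
}
resultdf <- cbind(resultdf, as.data.frame(lines_cols, stringsAsFactors = FALSE))
resultdf
```

The comparisons run on whole matrices at once, so the only explicit loop is over the four combinations. With >300K pairs and 177 lines, g1 and g2 each hold roughly 53 million values, so it may be necessary to process df1 in chunks of rows if memory becomes a problem.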