Wednesday, January 25, 2017

Conditional Counting Across Tables in R

I am analyzing the effects of pairwise gene interactions (>300K) on fruit fly behavior and have run into an issue. I would like to count how often certain pairs of gene states (i.e., alleles) occur across my 177 lines.

Some toy tables:

state1 <- c("A","B","C","A","B","A")
state2 <- c("B","C","D","D","D","C")

df1 <- data.frame(state1,state2)

state <- c("A","B","C","D")
line_1 <- c(0,0,2,0)
line_2 <- c(2,0,2,2)
line_3 <- c(2,2,2,0)
line_4 <- c(0,2,2,2)
line_5 <- c(2,0,0,0)

df2 <- data.frame(state,line_1,line_2,line_3,line_4,line_5)

For each pair of states in df1, I am hoping to get an output that returns the number of lines in df2 with each combination of values (state1=0 and state2=0, state1=0 and state2=2, state1=2 and state2=0, state1=2 and state2=2), plus the names of the lines with each of those combinations:

> resultdf
  state1 state2 state10state20 state10state22 state12state20 state12state22       lines00       lines02       lines20       lines22
1      A      B              1              1              2              1        line_1        line_4 line_2,line_5        line_3
2      B      C              1              2              0              2        line_5 line_1,line_2            NA line_3,line_4
3      C      D              1              0              2              2        line_5            NA line_1,line_3 line_2,line_4
4      A      D              1              1              2              1        line_1        line_4 line_3,line_5        line_2
5      B      D              2              1              1              1  line_1,line_5        line_2        line_3        line_4
6      A      C              0              2              1              2            NA line_1,line_4        line_5 line_2,line_3
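
One way a table like the one above could be built without explicit loops is sketched below. It is only a sketch under stated assumptions: line_2 is taken to be c(2,0,2,2), which is what the expected resultdf implies, the values in df2 are assumed to be only 0 or 2, and the helper names geno, code, counts and line_names are made up for illustration.

```r
# Toy data from the question. ASSUMPTION: line_2 is c(2,0,2,2),
# the values implied by the expected resultdf.
df1 <- data.frame(state1 = c("A","B","C","A","B","A"),
                  state2 = c("B","C","D","D","D","C"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(state  = c("A","B","C","D"),
                  line_1 = c(0,0,2,0),
                  line_2 = c(2,0,2,2),
                  line_3 = c(2,2,2,0),
                  line_4 = c(0,2,2,2),
                  line_5 = c(2,0,0,0),
                  stringsAsFactors = FALSE)

# Matrix of values: one row per state, one column per line.
geno <- as.matrix(df2[, -1])
rownames(geno) <- as.character(df2$state)

# For every pair in df1, look up the rows of both states by name.
g1 <- geno[as.character(df1$state1), , drop = FALSE]
g2 <- geno[as.character(df1$state2), , drop = FALSE]

# One two-character code per pair and line: "00", "02", "20" or "22".
code <- matrix(paste0(g1, g2), nrow = nrow(g1),
               dimnames = list(NULL, colnames(geno)))
combos <- c("00", "02", "20", "22")

# Number of lines showing each combination, per pair.
counts <- sapply(combos, function(k) rowSums(code == k))
colnames(counts) <- paste0("state1", substr(combos, 1, 1),
                           "state2", substr(combos, 2, 2))

# Comma-separated names of those lines, NA when there are none.
line_names <- sapply(combos, function(k)
  apply(code == k, 1, function(hit)
    if (any(hit)) paste(colnames(code)[hit], collapse = ",") else NA_character_))
colnames(line_names) <- paste0("lines", combos)

resultdf <- data.frame(df1, counts, line_names, stringsAsFactors = FALSE)
```

With more than 300K pairs, code holds one cell per pair and line, so it may be worth running this over df1 in chunks if memory becomes a problem.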

I started to look into for loops and if statements, but gathered that other approaches are better suited to R. I am a novice in R (and coding in general), so I wasn't sure where to turn next. Thank you in advance for your help.
