Tuesday, 5 April 2016

Pick a column to multiply with, contingent on value of other variables

I am still taking my first steps with R and have found SO to be a great tool for learning more and finding answers to my questions. For this one, though, I did not manage to find a good solution here.

I have a dataframe that can be simplified to this structure:

df <- data.frame(v1   = rep(1:2, times = 3),
                 v2   = c("A", "B", "B", "A", "B", "A"),
                 v3   = sample(1:6),
                 xA_1 = sample(1:6),
                 xA_2 = sample(1:6),
                 xB_1 = sample(1:6),
                 xB_2 = sample(1:6))

df thus looks like this (the exact values vary between runs because of sample()):

   v1 v2 v3 xA_1 xA_2 xB_1 xB_2
1:  1  A  6    4    5    4    2
2:  2  B  3    6    3    3    5
3:  1  B  5    1    1    6    1
4:  2  A  4    5    4    2    3
5:  1  B  1    2    2    5    6
6:  2  A  2    3    6    1    4

I now want R to create a fourth variable, which is dependent on the values of v1 and v2. I achieve this by using the following code:

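library(data.table)

df <- data.table(df)
# Nested ifelse: one branch per (v1, v2) combination, falling back to v3 * 1
df[, v4 := ifelse(v1 == 1 & v2 == "A", v3 * xA_1,
           ifelse(v1 == 1 & v2 == "B", v3 * xB_1,
           ifelse(v1 == 2 & v2 == "A", v3 * xA_2,
           ifelse(v1 == 2 & v2 == "B", v3 * xB_2, v3 * 1))))]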

So v4 is created by multiplying v3 by the column whose name contains the v2 and v1 values (e.g. for row 1: v1 = 1 and v2 = A, thus multiply v3 = 6 by xA_1 = 4 -> 24).

df$v4
[1] 24 15 30 16  5 12

Obviously, my ifelse approach becomes tedious when v1 and v2 have many more distinct values than they do in this example. So I am looking for an efficient way to tell R: if v1 == y & v2 == z, multiply v3 by column xz_y.

I tried writing a for loop, writing a function that takes y and z as indices, and using the apply function. However, none of these worked as intended.

I appreciate any ideas!
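
For illustration, here is a minimal sketch of one possible direction, not a definitive answer: build the target column name for each row from v2 and v1, then pick one cell per row with base R matrix indexing. It assumes a plain data.frame (not a data.table), assumes the multiplier columns are exactly those whose names start with "x", and hard-codes the values printed above so the result can be compared with the v4 shown earlier. The helper names target, x_mat and multiplier are made up for this sketch.

```r
# Rebuild the example with the exact values printed above
# (the original uses sample(), so its values change from run to run).
df <- data.frame(
  v1   = rep(1:2, times = 3),
  v2   = c("A", "B", "B", "A", "B", "A"),
  v3   = c(6, 3, 5, 4, 1, 2),
  xA_1 = c(4, 6, 1, 5, 2, 3),
  xA_2 = c(5, 3, 1, 4, 2, 6),
  xB_1 = c(4, 3, 6, 2, 5, 1),
  xB_2 = c(2, 5, 1, 3, 6, 4),
  stringsAsFactors = FALSE
)

# Name of the multiplier column for each row,
# e.g. v1 = 1 and v2 = "A" give "xA_1".
target <- paste0("x", df$v2, "_", df$v1)

# Numeric matrix holding only the x columns (assumption: they all start with "x").
x_mat <- as.matrix(df[grep("^x", names(df))])

# Matrix indexing: each (row, column) pair selects a single cell.
# match() gives NA when no matching column exists, which yields NA here.
multiplier <- x_mat[cbind(seq_len(nrow(df)), match(target, colnames(x_mat)))]

# Same fallback as the last ifelse branch: multiply by 1 if no column matches.
multiplier[is.na(multiplier)] <- 1

df$v4 <- df$v3 * multiplier
print(df$v4)
```

With these values the result agrees with the nested ifelse output (24 15 30 16 5 12). The number of distinct v1 and v2 values no longer matters, because the column name is computed rather than enumerated.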
