I have a dataset like this
dat <- data.frame(Col0 =rep(c("grp1","grp2","grp3", "grp4"), each = 4),
Col1 = rep(c("B","S","S","B"), 4),
Col2 = rep(c(1,2,3,4), 4),
Col3 = rep(c(0.1,0.2,0.3,0.4), 4))
I'm trying to create a fourth column as shown below
dat1 <- data.frame(Col0 =rep(c("grp1","grp2","grp3", "grp4"), each = 4),
Col1 = rep(c("B","S","S","B"), 4),
Col2 = rep(c(1,2,3,4), 4),
Col3 = rep(c(0.1,0.2,0.3,0.4), 4),
Col4 = c(1, 0.8, 1.26, 4, 1, 0.8, 1.26, 4, 1, 0.8, 1.26, 4))
What I have tried till now,
d1 <- dat %>%
group_by(Col0) %>%
mutate(Col4 = if_else(Col1 == 'B', Col2,
if_else(Col1 == 'S' & lag(Col1 == "B"), lag(Col2)- Col3*lag(Col2), 0)))
d1
The Answer I'm getting is not what is in Col4, which is desired. The condition for getting Col4 is :
if Col1 is B then get the value of Col2 as it is,
if Col1 is S & Previous Value of Col1 is B then 1-(0.2*1) which is equal to 0.8
if Col1 is S & Previous Value of Col1 is S as well then (1+0.8) -((1+0.8)*0.3) which is 1.26
Basically, it's like first performing difference and then performing cumulative sum including the difference and so on.
For now, I have taken a simple example to understand what I'm trying to achieve, the actual data-set has more than 1 million Obs. and Several Thousand Groups and what's worse is that the Combination of 'B' & 'S' alter. Like in some groups it's B,B,S,S
and So on...
Any Help on this will be appreciated as I have tried several things other than if_else()
and seen many conditional cumulative sum Ques as well but to no avail.
I think the same could be done easily in Excel with SUMIF() Function, but i need to do this with R
Aucun commentaire:
Enregistrer un commentaire