My [simplified] data looks like this:
id = 1:50
first_active = sample(1:20, 50, replace = TRUE)
df = data.frame(id, first_active)
# add 35 binary weekly activity columns: week1 ... week35
for (i in 1:35) {
  df[paste0("week", i)] = sample(0:1, 50, replace = TRUE)  # paste0() needs no sep argument
}
I'm writing a for loop to create the variables p1, p2, ..., p35 and populate them as follows
(an example for the p4 column; the same rule should apply to each of p1-p35):
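library(dplyr)
# first_active is the column name in the sample data above (my later attempt calls it first_order)
df %>%
  mutate(
    p4 = ifelse(week4 > 0, "active",
         ifelse(first_active < 4 & p3 == "lapsed2", "lapsed3",
         ifelse(first_active < 4 & p3 == "lapsed1", "lapsed2",
         ifelse(first_active < 4 & p3 == "active", "lapsed1", "NA")))))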
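The p4 rule above could be generalised to every week along these lines. This is only a minimal base-R sketch under some assumptions: `next_state()` is a hypothetical helper name, the state before week 1 is taken to be the string "NA", the seed is added purely for reproducibility, and `first_active` is used as the column name from the sample data.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible
id <- 1:50
first_active <- sample(1:20, 50, replace = TRUE)
df <- data.frame(id, first_active)
for (i in 1:35) df[[paste0("week", i)]] <- sample(0:1, 50, replace = TRUE)

# next_state() is a hypothetical helper: it applies the nested ifelse rule
# to one week, given the previous period's state and whether the customer
# had already started before this week.
next_state <- function(week, prev, started) {
  ifelse(week > 0, "active",
  ifelse(started & prev == "lapsed2", "lapsed3",
  ifelse(started & prev == "lapsed1", "lapsed2",
  ifelse(started & prev == "active", "lapsed1", "NA"))))
}

prev <- rep("NA", nrow(df))  # assumption: no known state before week 1
for (k in 1:35) {            # k is a single number, not a vector of indices
  df[[paste0("p", k)]] <- next_state(df[[paste0("week", k)]], prev,
                                     df$first_active < k)
  prev <- df[[paste0("p", k)]]
}
```

Using `[[` with a single `k` keeps every operand an ordinary vector of length `nrow(df)`, so each `ifelse()` call works on one week at a time.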
In essence, the outcome should look like this, for columns p1-p35:
reference data
head(markov_df[,1:37])
  id first_active week1 week2 week3 week4 week5 week6 week7 week8 week9 week10 week11 week12 week13 week14 week15 week16
1  1           14     1     0     1     1     0     0     0     0     0      1      1      0      0      0      0      0
2  2            3     0     1     1     0     1     0     0     1     0      1      0      0      1      0      0      1
3  3            1     1     1     1     0     1     0     0     0     1      0      0      0      1      0      1      0
4  4            3     0     1     0     1     1     0     1     1     1      0      1      1      1      0      0      0
5  5            1     1     0     1     1     0     1     1     0     1      1      1      0      0      0      1      1
6  6           14     0     1     1     0     0     1     0     1     0      1      1      1      1      0      0      1
  week17 week18 week19 week20 week21 week22 week23 week24 week25 week26 week27 week28 week29 week30 week31 week32 week33
1      0      0      1      1      1      1      1      1      1      0      1      1      1      0      1      1      0
2      0      0      1      0      0      1      1      0      1      0      1      1      1      0      0      0      1
3      1      1      1      0      1      1      1      1      0      0      0      1      1      0      1      1      0
4      0      1      0      1      1      0      1      0      1      1      1      0      1      0      0      0      0
5      0      1      1      1      0      1      0      1      0      0      1      1      1      1      1      0      1
6      0      0      1      1      0      0      0      0      0      1      0      1      0      1      1      1      1
  week34 week35
1      0      0
2      1      1
3      0      1
4      0      0
5      1      0
6      0      0
desired output data (cols 1 - 15)
head(markov_df[,38:52])
       p1      p2      p3      p4      p5      p6      p7      p8      p9     p10     p11     p12     p13     p14     p15
1  active      NA  active  active      NA      NA      NA      NA      NA  active  active      NA      NA      NA      NA
2      NA  active  active lapsed1  active lapsed1 lapsed2  active lapsed1  active lapsed1 lapsed2  active lapsed1 lapsed2
3  active  active  active lapsed1  active lapsed1 lapsed2 lapsed3  active lapsed1 lapsed2 lapsed3  active lapsed1  active
4      NA  active      NA  active  active lapsed1  active  active  active lapsed1  active  active  active lapsed1 lapsed2
5  active lapsed1  active  active lapsed1  active  active lapsed1  active  active  active lapsed1 lapsed2 lapsed3  active
6      NA  active  active      NA      NA  active      NA  active      NA  active  active  active  active      NA      NA
What I managed to get so far is:
i = 2:35
j = 1:34
for(ind in seq_along(i)) {
markov_function[paste0("p", i,sep="")] = ifelse(markov_function[paste0("week", i, sep="")]> 0, "active",
ifelse(markov_function["first_order"] < i & markov_function[paste0("week", i-1, sep="")] == paste0("lapsed", j, sep=""),
paste0("lapsed", j+1, sep=""), NA))
}
but I get the error:
Error in ifelse(markov_function["first_order"] < i & markov_function[paste0("week", :
binary operation on non-conformable arrays
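One plausible reading of this message, sketched on a toy data frame: because `i` is a whole vector inside the loop, the two sides of `&` end up as matrices with different numbers of columns. The data frame `d` and its values are made up for illustration; only the column naming follows the attempt above.

```r
# Toy data frame (hypothetical values) with the same column naming as above
d <- data.frame(first_order = c(1, 1, 2, 3),
                week1 = c(0, 1, 0, 1),
                week2 = c(1, 0, 1, 0))
i <- 1:2  # a vector of indices, as in the loop above

left  <- d["first_order"] < i          # single-column data frame: 4 x 1 matrix
right <- d[paste0("week", i)] == 1     # two-column data frame:    4 x 2 matrix

# Combining matrices of different shapes fails; capture the message
res <- tryCatch(left & right, error = function(e) conditionMessage(e))

# With a single index and [[ ]], both sides are plain vectors of length 4
k <- i[2]
ok <- d[["first_order"]] < k & d[[paste0("week", k)]] == 1
```

If this reading is right, indexing with `i[ind]` (one number per iteration) and extracting columns with `[[` instead of `[` would avoid the shape mismatch.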
I suspect I'm missing something basic. I'd be grateful for any help here, thanks! Kasia