I've got the customer activity data in the following format:
id = rep(1:2, each = 7)
week = rep(22:28, 2)
orders_num = c(NA, 1,0,0,0,3,0,1,4,0,0,1,0,0)
orders_char = c("none", "1", "none", "none", "none", "2+", "none", "1", "2+", "none", "none","1", "none", "none")
activity =c(NA, "active", "lapsed1", "lapsed2", "lapsed3", "active", "lapsed1", "active", "active", "lapsed1", "lapsed2" , "active", "lapsed1", "lapsed2")
df = data.frame(cbind(id, week, orders_num, orders_char, activity))
df
id week orders_num orders_char activity
1 22 <NA> none <NA>
1 23 1 1 active
1 24 0 none lapsed1
1 25 0 none lapsed2
1 26 0 none lapsed3
1 27 3 2+ active
1 28 0 none lapsed1
2 22 1 1 active
2 23 4 2+ active
2 24 0 none lapsed1
2 25 0 none lapsed2
2 26 1 1 active
2 27 0 none lapsed1
2 28 0 none lapsed2
I'm trying to create a column that would return a combination of activity
and orders_char
if activity == "active"
and a respective 'lapsed'
category with orders_char
corresponding to the most recent 'active'
status:
df_solution
id week orders_num orders_char activity final
1 22 <NA> none <NA> <NA>
1 23 1 1 active active_1
1 24 0 none lapsed1 lapsed1_1
1 25 0 none lapsed2 lapsed2_1
1 26 0 none lapsed3 lapsed3_1
1 27 3 2+ active active_2+
1 28 0 none lapsed1 lapsed1_2+
2 22 1 1 active active_1
2 23 4 2+ active active_2+
2 24 0 none lapsed1 lapsed1_2+
2 25 0 none lapsed2 lapsed2_2+
2 26 1 1 active active_1
2 27 0 none lapsed1 lapsed1_1
2 28 0 none lapsed2 lapsed2_1
Obviously, simply referring to the previous row is not going to work in cases where lapsed
periods last longer:
df %>% group_by(id) %>% arrange(id, week) %>%
mutate(final = ifelse(activity == "active", paste(activity, orders_char, sep="_"),
ifelse(grepl("lapsed", activity), paste(activity, lag(orders_char), sep="_"), NA)))
id week orders_num orders_char activity final
1 22 NA none NA <NA>
1 23 1 1 active active_1
1 24 0 none lapsed1 lapsed1_1
1 25 0 none lapsed2 lapsed2_none
1 26 0 none lapsed3 lapsed3_none
1 27 3 2+ active active_2+
1 28 0 none lapsed1 lapsed1_2+
2 22 1 1 active active_1
2 23 4 2+ active active_2+
2 24 0 none lapsed1 lapsed1_2+
2 25 0 none lapsed2 lapsed2_none
2 26 1 1 active active_1
2 27 0 none lapsed1 lapsed1_1
2 28 0 none lapsed2 lapsed2_none
Is there any way I could "fix" orders_char
value from one active period to the other?
Thanks for your suggestions!
Aucun commentaire:
Enregistrer un commentaire