I have a long-format dataset and I want to create a variable giving each woman's parity at the beginning of the reference period.
This is a sample of the data:
key <- c(12, 12, 12, 12, 13, 13, 13, 13)
Year <- c(96, 97, 98, 99, 96, 97, 98, 99)
CHBORN <- c(1, 1, 1, 1, 4, 4, 4, 4)
CHYEAR <- c(0, 0, 1, 0, 1, 0, 1, 1)
birth_order <- c(0, 0, 1, 0, 4, 0, 3, 2)
df <- data.frame(key, Year, CHBORN, CHYEAR, birth_order)
# CHBORN is the number of children the woman had had by 99.
# CHYEAR is the number of children born in that year.
# birth_order: if the woman had a child in that year, the order of that birth.
I need to create a variable pa, the parity each woman has at the beginning of the reference period, which is not necessarily the same as birth order.
I tried this code already:
df[, pa := ifelse(birth_order > 0, birth_order - CHYEAR, lag(pa)), by = 'key']
I expect this result:
  key Year CHBORN CHYEAR birth_order pa
1  12   96      1      0           0  0
2  12   97      1      0           0  0
3  12   98      1      1           1  0
4  12   99      1      0           0  1
5  13   96      4      1           4  3
6  13   97      4      0           0  3
7  13   98      4      1           3  2
8  13   99      4      1           2  1
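For reference, here is a base-R sketch of the rule my attempt above is trying to express, since `:=` needs a data.table and `lag(pa)` refers to a column that does not exist yet. It assumes that in years with a birth pa is birth_order - CHYEAR, that in years without a birth the previous pa is carried forward within key, and that years before the first birth get 0. The helper `carry_forward` is a hypothetical name, not from any package. Under these assumptions the sketch matches the expected table on every row except row 4, where it gives 0 instead of 1, so a plain carry-forward is probably not the whole rule.

```r
# Sample data from above.
df <- data.frame(
  key         = c(12, 12, 12, 12, 13, 13, 13, 13),
  Year        = c(96, 97, 98, 99, 96, 97, 98, 99),
  CHBORN      = c(1, 1, 1, 1, 4, 4, 4, 4),
  CHYEAR      = c(0, 0, 1, 0, 1, 0, 1, 1),
  birth_order = c(0, 0, 1, 0, 4, 0, 3, 2)
)

# Hypothetical helper: last observation carried forward within one vector.
# Positions before the first non-NA value get `default` (an assumption).
carry_forward <- function(x, default = 0) {
  idx <- cumsum(!is.na(x))           # index of the most recent non-NA value
  c(default, x[!is.na(x)])[idx + 1]  # idx == 0 maps to `default`
}

# Parity at the start of the year, known directly only in years with a birth.
pa_birth <- ifelse(df$birth_order > 0, df$birth_order - df$CHYEAR, NA)

# Fill the years without a birth, separately for each woman.
# Assumes rows are already sorted by Year within key.
df$pa <- ave(pa_birth, df$key, FUN = carry_forward)

print(df)
```

Using `ave()` keeps everything in base R; the same grouping could presumably be done with data.table or dplyr instead.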
Thanks for the help!