Thursday, April 4, 2019

How can I create variables using several condition statements?

I have a long-format dataset, and I want to create a variable giving each woman's parity at the beginning of the reference period.

This is a sample of the data:

key <- c(12, 12, 12, 12, 13, 13, 13, 13)
Year <- c(96, 97, 98, 99, 96, 97, 98, 99)
CHBORN <- c(1, 1, 1, 1, 4, 4, 4, 4)
CHYEAR <- c(0, 0, 1, 0, 1, 0, 1, 1)
birth_order <- c(0, 0, 1, 0, 4, 0, 3, 2)

df <- data.frame(key, Year, CHBORN, CHYEAR, birth_order)

# CHBORN is the total number of children the woman had had up to 99.
# CHYEAR is the number of children born in that year.
# birth_order: if the woman had a child in that year, the birth order of that child.


I need to create a variable pa, the parity each woman has at the beginning of the reference period, which is not necessarily the same as the birth order.

I tried this code already:

df[, pa := ifelse(birth_order > 0, birth_order - CHYEAR, lag(pa)), by = 'key']

I expect this result:

  key Year CHBORN CHYEAR birth_order pa
1  12   96      1      0           0  0
2  12   97      1      0           0  0
3  12   98      1      1           1  0
4  12   99      1      0           0  1
5  13   96      4      1           4  3
6  13   97      4      0           0  3
7  13   98      4      1           3  2
8  13   99      4      1           2  1
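
In case it helps to pin down the rule, here is a minimal base-R sketch of what the attempt above seems to be aiming for. Two things stand out in that line: `:=` is data.table syntax while `df` is a plain data.frame, and `lag(pa)` refers to `pa` before it exists. The sketch assumes the rows are sorted by `key` and `Year`, that parity before the first recorded birth is 0, and that years without a birth keep the previous `pa`; `carry_forward` is a hypothetical helper, not a library function.

```r
# Sample data from the question
df <- data.frame(
  key         = c(12, 12, 12, 12, 13, 13, 13, 13),
  Year        = c(96, 97, 98, 99, 96, 97, 98, 99),
  CHBORN      = c(1, 1, 1, 1, 4, 4, 4, 4),
  CHYEAR      = c(0, 0, 1, 0, 1, 0, 1, 1),
  birth_order = c(0, 0, 1, 0, 4, 0, 3, 2)
)

# Hypothetical helper: carry the last non-missing value forward,
# using `start` for any leading gap (assumption: parity 0 before the first birth)
carry_forward <- function(x, start = 0) {
  last <- start
  for (i in seq_along(x)) {
    if (is.na(x[i])) x[i] <- last else last <- x[i]
  }
  x
}

# In a birth year, parity at the start of the year is the birth order
# minus the births in that year; other years are left NA for now
pa_raw <- ifelse(df$birth_order > 0, df$birth_order - df$CHYEAR, NA_real_)

# Fill the NA years separately for each woman (rows assumed sorted by key, Year)
df$pa <- ave(pa_raw, df$key, FUN = carry_forward)
```

On the sample data this gives pa = 0, 0, 0, 0 for key 12 and 3, 3, 2, 1 for key 13. That matches the expected table except for key 12 in year 99 (0 instead of 1), so the rule for years following a birth would still need to be settled.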

Thanks for the help!
