I have census data with four columns (Age, Broad Age, Gender, Ethnicity) and multiple rows for each individual. Separately, I have data on employment type for different age groups and ethnicities. From these datasets I know the number of people in each employment type by "Age" group, but I only know the ethnicity of those people by "Broad Age" group.
For example, I know that 23 males aged 16-19, 53 males aged 20-21, and 42 males aged 22-24 work in part-time employment, but I only know ethnicity for the "Broad Age" group of 16-24, in which I know 38 males are white, and so on.
I am new to R and have managed to write IF statements of the form: if "Gender" and "Age" match, then the person is in part-time employment. However, that populates all rows. I am trying to find a way to specify the distributions according to what I know from the census data, so that a fifth column contains the correct number of part-time employees by "Age" group, while within each "Broad Age" group they are randomly allocated across the ethnic groups.
I think I need to create a function, but I am a little confused about how to incorporate the distribution part. Any advice would be gratefully received!
Example data:
Age    Broad Age  Gender  Ethnicity
16-17  16-24      Male    White
16-17  16-24      Male    White
16-17  16-24      Male    Asian
16-17  16-24      Male    Asian
16-17  16-24      Male    Asian
18-19  16-24      Male    White
18-19  16-24      Male    White
18-19  16-24      Male    White
18-19  16-24      Male    White
18-19  16-24      Male    White
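
As a rough sketch of the allocation described above (not a complete solution), one possible approach in base R is to pick, within each Age and Gender group, a random subset of rows of the known size and flag them as part time. The `targets` table and its counts (2 and 3) are made-up numbers for illustration, and the column names `BroadAge`, `PartTime` and `n_part_time` are assumptions. This sketch only fixes the totals per "Age" group; ethnicity falls out of the random draw and is not forced to match known "Broad Age" totals, which would need an extra step.

```r
# Toy census rows matching the example data above
census <- data.frame(
  Age       = rep(c("16-17", "18-19"), each = 5),
  BroadAge  = "16-24",
  Gender    = "Male",
  Ethnicity = c("White", "White", "Asian", "Asian", "Asian",
                "White", "White", "White", "White", "White"),
  stringsAsFactors = FALSE
)

# Hypothetical targets: number of part-time workers per Age/Gender group
# (made-up counts, not taken from any real dataset)
targets <- data.frame(
  Age         = c("16-17", "18-19"),
  Gender      = "Male",
  n_part_time = c(2, 3),
  stringsAsFactors = FALSE
)

set.seed(1)  # only so the random allocation is reproducible

# Start with nobody in part-time employment
census$PartTime <- FALSE

for (i in seq_len(nrow(targets))) {
  # Rows belonging to this Age/Gender group
  idx <- which(census$Age == targets$Age[i] &
               census$Gender == targets$Gender[i])
  # Never ask for more rows than the group contains
  n <- min(targets$n_part_time[i], length(idx))
  # Randomly choose n of those rows and flag them
  chosen <- idx[sample.int(length(idx), n)]
  census$PartTime[chosen] <- TRUE
}

# Check the allocation: part-time counts per Age group, then by ethnicity
table(census$Age, census$PartTime)
table(census$Ethnicity, census$PartTime)
```

Using `sample.int()` on the group size, rather than `sample(idx, n)`, avoids the surprise that `sample()` treats a single number as the range `1:x` when a group happens to contain exactly one row.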