Wednesday, 21 October 2015

Nested if statements for distributing data in R

I have census data with four columns (Age, Broad Age, Gender, Ethnicity) and a row for each individual. Separately, I have data on employment type for different age groups and ethnicities. From these datasets I know the number of people in each employment type by "Age" group, but I only know the ethnicity of those people by "Broad Age" group.

For example, I know that 23 males aged 16-19, 53 males aged 20-21, and 42 males aged 22-24 work in part-time employment, but I only know ethnicity for the "Broad Age" group of 16-24, in which I know 38 males are White, and so on.

I am new to R. I have managed to write if statements that say: if "Gender" and "Age" match, then the person is in part-time employment. However, that populates every matching row. I am trying to find a way to specify the distributions according to what I know from the census data, so that a fifth column holds the correct number of part-time employees for each "Age" group, while within the "Broad Age" group those employees are randomly allocated to each ethnic group.
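To illustrate the first half of this (the right number of part-time rows per "Age" group, chosen at random rather than filling every matching row), here is a minimal sketch. The helper name `allocate_part_time`, the `PartTime` column, the `BroadAge` column name and the counts 2 and 3 are all made up for illustration, and the ethnicity totals are not enforced here.

```r
# Hypothetical helper: flag a fixed number of randomly chosen rows per group.
# `counts` is a named vector: names are Age group labels, values are the
# known number of part-time workers in that group.
allocate_part_time <- function(df, counts, group_col = "Age") {
  df$PartTime <- FALSE
  for (grp in names(counts)) {
    rows <- which(df[[group_col]] == grp)
    n <- min(counts[[grp]], length(rows))  # never ask for more rows than exist
    if (n == 0) next
    # Index into `rows` via sample.int(); sample(rows, n) misbehaves when
    # `rows` has length one, because it then samples from 1:rows.
    chosen <- rows[sample.int(length(rows), n)]
    df$PartTime[chosen] <- TRUE
  }
  df
}

# Small data frame mirroring the example rows in the question
census <- data.frame(
  Age = rep(c("16-17", "18-19"), each = 5),
  BroadAge = "16-24",
  Gender = "Male",
  Ethnicity = c("White", "White", "Asian", "Asian", "Asian", rep("White", 5)),
  stringsAsFactors = FALSE
)

set.seed(1)  # only so the random choice is reproducible
result <- allocate_part_time(census, c("16-17" = 2, "18-19" = 3))
table(result$Age, result$PartTime)
```

Unlike a plain `ifelse()` on Gender and Age, this marks only as many rows as the known count for each group.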

I think I need to create a function, but I am a little confused about how to incorporate the distribution part. Any advice would be gratefully received!

Example data:

Age     Broad Age       Gender      Ethnicity
16-17   16-24       Male            White
16-17   16-24       Male            White
16-17   16-24       Male            Asian
16-17   16-24       Male            Asian
16-17   16-24       Male            Asian
18-19   16-24       Male            White
18-19   16-24       Male            White
18-19   16-24       Male            White
18-19   16-24       Male            White
18-19   16-24       Male            White
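Using the example rows above, one possible way to respect both the "Age" totals and the "Broad Age" ethnicity totals is to draw a random Age by Ethnicity table of part-time counts with fixed margins, then sample that many rows in each cell. This sketch uses base R's `r2dtable()`. The totals (2 and 3 by age, 4 White and 1 Asian), the helper name `draw_cells`, and the `BroadAge` and `PartTime` column names are invented for illustration. `r2dtable()` knows nothing about how many people exist in each cell, so the sketch simply redraws until the table fits.

```r
census <- data.frame(
  Age = rep(c("16-17", "18-19"), each = 5),
  BroadAge = "16-24",
  Gender = "Male",
  Ethnicity = c("White", "White", "Asian", "Asian", "Asian", rep("White", 5)),
  stringsAsFactors = FALSE
)

# People available in each Age x Ethnicity cell
avail <- unclass(table(census$Age, census$Ethnicity))

# Made-up part-time totals; both vectors must sum to the same number
age_pt <- c("16-17" = 2, "18-19" = 3)
eth_pt <- c(Asian = 1, White = 4)

# Hypothetical helper: draw a table with the given margins that does not
# exceed the number of people available in any cell.
draw_cells <- function(avail, age_pt, eth_pt, max_tries = 1000) {
  row_tot <- age_pt[rownames(avail)]
  col_tot <- eth_pt[colnames(avail)]
  for (i in seq_len(max_tries)) {
    cells <- r2dtable(1, row_tot, col_tot)[[1]]
    if (all(cells <= avail)) {
      dimnames(cells) <- dimnames(avail)
      return(cells)
    }
  }
  stop("no feasible table found; check that the totals are compatible")
}

set.seed(42)  # only so the random draw is reproducible
cells <- draw_cells(avail, age_pt, eth_pt)

# Flag the drawn number of rows in each cell as part time
census$PartTime <- FALSE
for (a in rownames(cells)) {
  for (e in colnames(cells)) {
    rows <- which(census$Age == a & census$Ethnicity == e)
    k <- cells[a, e]
    if (k > 0) census$PartTime[rows[sample.int(length(rows), k)]] <- TRUE
  }
}
```

With real data the same steps would presumably be repeated for each Gender and each "Broad Age" group, and the redraw loop may be slow when few tables are feasible.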
