I have an individual-level dataset with demographic information of each person. It also provides a unique household id along with other variables:
id if_adult (>18 yrs old) marital_status
1 1 Single
1 1 Single
2 1 Married
2 1 Married
2 0 Married
Each household has either one single adult or two adults who are both married or both single. Some households also have children. I am trying to create a dummy variable called "unmarried couple" that flags households with exactly two single adults. Because several rows share the same household id, I want every row of a household to carry the same, correct label. Currently, the code I have is:
individual_data$`unmarried couple` <- ifelse((individual_data$if_adult ==
"1" & individual_data$id == individual_data$id) &
individual_data$marital_status == "Single", "1","0")
But this incorrectly categorizes households led by one single adult (i.e. single moms and single dads with children) as unmarried couples. This is the key problem: once I can tell these households apart, the variable will be accurate. To fix it, I am attempting to create a new variable that gives the total number of adults per household:
id if_adult (>18 yrs old) marital_status total_adults
1 1 Single 2
1 1 Single 2
2 1 Married 2
2 1 Married 2
2 0 Married 2
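A minimal sketch of how such a per-household count might be computed in base R with `ave()`. The toy data frame just reproduces the example rows above, and the sketch assumes `if_adult` holds 1/0 values (stored as numbers, strings, or a factor).

```r
# Toy data mirroring the example rows above
individual_data <- data.frame(
  id = c(1, 1, 2, 2, 2),
  if_adult = c(1, 1, 1, 1, 0),
  marital_status = c("Single", "Single", "Married", "Married", "Married"),
  stringsAsFactors = FALSE
)

# Convert the adult flag to numeric (going through character so that
# a factor or string column is handled too), then sum it within each
# household id. ave() repeats the group total on every row of the group.
adult_flag <- as.numeric(as.character(individual_data$if_adult))
individual_data$total_adults <- ave(adult_flag, individual_data$id, FUN = sum)
```

`ave()` returns a vector as long as the input, so the household total lands on each member's row without a separate merge step.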
Then I would create my desired variable, excluding the single-adult households by requiring exactly two adults:
individual_data$`unmarried couple` <- ifelse((individual_data$total_adults
== 2 & individual_data$id == individual_data$id) &
individual_data$marital_status == "Single", "1","0")
Ultimately, I want the result to look like this, with the same logic applied to the rest of the data:
id if_adult marital_status total_adults unmarried couple
1 1 Single 2 1
1 1 Single 2 1
2 1 Married 2 0
2 1 Married 2 0
2 0 Married 2 0
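One way this output might be produced is to count single adults per household and compare that count with the total number of adults. This is only a sketch under assumptions: household 3 (a single parent with one child) is hypothetical data added to check the problem case, and the helper names `is_adult`, `is_single_adult`, `n_adults` and `n_single_adults` are made up for illustration.

```r
# Example rows from above plus a hypothetical single-parent household (id 3)
individual_data <- data.frame(
  id = c(1, 1, 2, 2, 2, 3, 3),
  if_adult = c(1, 1, 1, 1, 0, 1, 0),
  marital_status = c("Single", "Single", "Married", "Married", "Married",
                     "Single", "Single"),
  stringsAsFactors = FALSE
)

# Row-level flags: is this person an adult, and a single adult?
is_adult <- individual_data$if_adult == 1
is_single_adult <- is_adult & individual_data$marital_status == "Single"

# Household-level counts, repeated on every row of the household
n_adults <- ave(as.numeric(is_adult), individual_data$id, FUN = sum)
n_single_adults <- ave(as.numeric(is_single_adult), individual_data$id, FUN = sum)

# A household is flagged only if it has exactly two adults and both are single
individual_data$`unmarried couple` <- as.integer(n_adults == 2 & n_single_adults == 2)
```

Because the condition is evaluated on household-level counts, every row of a household gets the same value, and the single-parent household gets 0 since it has only one adult.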
Thanks in advance for any feedback and suggestions.