I searched for how to do an index match in R and found several solutions, but none of them involved conditionals or NA values. The most useful post was "Index Match in R", but the values that person was trying to match were identical. In my case, there are NA values and the matching is conditional.
I have two data sets, marketing_data and average_income.
My goal is to fill all the NA values in marketing_data$Income with the corresponding values in average_income$Mean_Income, under the conditions (marketing_data$Marital_Status == average_income$Marital_Status) & (marketing_data$Education == average_income$Education).
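To make the goal concrete, here is a minimal sketch of one way such a conditional lookup could be written without a loop, using base R's paste() and match(). The toy values, the "|" separator, and the helper names key_marketing, key_average, idx and missing_income are assumptions for illustration only; the real data sets are not shown here. It also assumes each Education/Marital_Status pair appears at most once in average_income and that "|" does not occur in the values.

```r
# Hypothetical toy data; column names follow the question, values are made up.
marketing_data <- data.frame(
  Education      = c("PhD", "Graduation", "PhD", "Graduation"),
  Marital_Status = c("Single", "Married", "Single", "Married"),
  Income         = c(58000, NA, NA, 52000),
  stringsAsFactors = FALSE
)
average_income <- data.frame(
  Education      = c("Graduation", "PhD"),
  Marital_Status = c("Married", "Single"),
  Mean_Income    = c(50000, 60000),
  stringsAsFactors = FALSE
)

# Build one lookup key per row from the two matching columns.
key_marketing <- paste(marketing_data$Education, marketing_data$Marital_Status, sep = "|")
key_average   <- paste(average_income$Education, average_income$Marital_Status, sep = "|")

# For each marketing row, find the row of average_income with the same key.
idx <- match(key_marketing, key_average)

# Overwrite only the missing incomes; observed incomes stay untouched.
missing_income <- is.na(marketing_data$Income)
marketing_data$Income[missing_income] <- average_income$Mean_Income[idx[missing_income]]

print(marketing_data)
```

Rows whose key has no counterpart in average_income get NA from match(), so their Income simply stays NA.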
I tried using a loop like this:
for (i in c(1:nrow(marketing_data), 1:nrow(average_income))) {
  if (marketing_data[i, "Education"] == average_income[i, "Education"] &
      marketing_data[i, "Marital_Status"] == average_income[i, "Marital_Status"]) {
    marketing_data[i, "Income"] <- average_income[i, "Mean_Income"]
  }
}
But I got this error:
Error in if (marketing_data[i, "Education"] == average_income[i, "Education"] & : missing value where TRUE/FALSE needed
Traceback:
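A likely cause of this error, sketched under assumptions: the loop uses the same index i for both data frames, so once i exceeds nrow(average_income) the lookup returns NA, the comparison becomes NA, and if () cannot decide. The toy values and the names out_of_range, cmp, msg and guarded below are hypothetical and only illustrate the mechanism.

```r
# Hypothetical toy data; the values are made up for illustration.
average_income <- data.frame(
  Education      = c("Graduation", "PhD"),
  Marital_Status = c("Married", "Single"),
  Mean_Income    = c(50000, 60000),
  stringsAsFactors = FALSE
)

# Indexing a row past the end of a data frame returns NA rather than an error.
out_of_range <- average_income[3, "Education"]
is.na(out_of_range)  # TRUE

# Comparing a value with NA yields NA, not TRUE or FALSE.
cmp <- "PhD" == out_of_range
is.na(cmp)  # TRUE

# Passing that NA to if () is what raises the error; capture its message.
msg <- tryCatch(
  if (cmp) "match" else "no match",
  error = function(e) conditionMessage(e)
)
print(msg)

# isTRUE() treats NA as FALSE, which is one way to guard such a test.
guarded <- if (isTRUE(cmp)) "match" else "no match"
print(guarded)
```

The same NA would appear if Education or Marital_Status themselves contained missing values in either data frame.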
Long story short, how can I impute the NA values in the Income column of marketing_data using the values from the Mean_Income column of average_income, matching on Education and Marital_Status?
Can it be done with a loop?
Or do I need to use Index Match?
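For comparison, a loop version seems possible as well. The sketch below is one assumption-laden way to write it: it iterates over the rows of marketing_data only, and for each row with a missing Income it searches average_income for the matching Education/Marital_Status pair. The toy values and the name hit are hypothetical.

```r
# Hypothetical toy data; the last row deliberately has no match in average_income.
marketing_data <- data.frame(
  Education      = c("PhD", "Graduation", "Master"),
  Marital_Status = c("Single", "Married", "Divorced"),
  Income         = c(NA, NA, NA),
  stringsAsFactors = FALSE
)
average_income <- data.frame(
  Education      = c("Graduation", "PhD"),
  Marital_Status = c("Married", "Single"),
  Mean_Income    = c(50000, 60000),
  stringsAsFactors = FALSE
)

for (i in seq_len(nrow(marketing_data))) {
  # Only rows with a missing Income need a lookup.
  if (is.na(marketing_data[i, "Income"])) {
    # which() drops NA comparisons, so the result is safe to test with length().
    hit <- which(
      average_income$Education == marketing_data[i, "Education"] &
        average_income$Marital_Status == marketing_data[i, "Marital_Status"]
    )
    # Fill in the mean only when exactly one matching row exists.
    if (length(hit) == 1) {
      marketing_data[i, "Income"] <- average_income[hit, "Mean_Income"]
    }
  }
}

print(marketing_data)
```

On large data a row-by-row loop like this will generally be slower than a vectorised lookup or a merge(), so it is mainly useful for understanding the logic.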
P.S. This is my first question on Stack Overflow. If you have any feedback on my question and how I could phrase or ask it better, please let me know!