Tuesday, December 22, 2020

How can I Index Match in R with NA values and conditional statements?

I searched for how to do an index match in R and found multiple solutions, but none of them involved conditionals or NA values. This was the most useful post: Index Match in R, but the values that person was trying to match were identical. In my case, there are NA values and the matching is conditional.

I have two data sets:

average_income

marketing_data

My goal is to fill all the NA values in marketing_data$Income with the corresponding values from average_income$Mean_Income, for rows where both conditions hold: (marketing_data$Marital_Status == average_income$Marital_Status) & (marketing_data$Education == average_income$Education).

I tried using a loop like this:

for (i in c(1:nrow(marketing_data), 1:nrow(average_income))) {
  if (marketing_data[i, "Education"] == average_income[i, "Education"] &
      marketing_data[i, "Marital_Status"] == average_income[i, "Marital_Status"]) {
    marketing_data[i, "Income"] <- average_income[i, "Mean_Income"]
  }
}

But I got this error:

Error in if (marketing_data[i, "Education"] == average_income[i, "Education"] & : missing value where TRUE/FALSE needed 
Traceback:
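
The error message above can be reproduced in isolation. One likely cause (an assumption, since the real tables are not shown) is that the loop index runs past the last row of the smaller table: indexing a data frame row that does not exist returns NA, any comparison with NA is NA, and if () cannot branch on NA. A minimal sketch with hypothetical toy values:

```r
# Toy lookup table; the values are hypothetical, only the column names
# come from the question
average_income <- data.frame(
  Education      = c("PhD", "Master"),
  Marital_Status = c("Single", "Married"),
  Mean_Income    = c(50000, 60000),
  stringsAsFactors = FALSE
)

# Row 3 does not exist, so indexing it yields NA rather than an error
out_of_range <- average_income[3, "Education"]
is.na(out_of_range)

# Comparing a value with NA yields NA, not TRUE or FALSE
cmp <- "PhD" == out_of_range
is.na(cmp)

# if (cmp) ... would stop with "missing value where TRUE/FALSE needed";
# isTRUE() treats NA as FALSE instead
isTRUE(cmp)
```

Wrapping the condition in isTRUE() would silence the error, but it would not make the row-by-row comparison correct, because row i of one table does not generally correspond to row i of the other.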

Long story short, how can I impute the NA values in the Income column of marketing_data using the values from the Mean_Income column of average_income, matching rows on both Education and Marital_Status?

Can it be done with a loop?

Or do I need to use Index Match?

P.S. This is my first question on Stack Overflow. If you have any feedback on my question and how I could phrase or ask it better, please let me know!
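
For what it's worth, the lookup described in the question can probably be done without a loop. The following is a minimal sketch, not a definitive answer: it uses base R's match() on a combined key built from the two matching columns. The data values are hypothetical stand-ins, and the "|" separator assumes that character never appears in Education or Marital_Status.

```r
# Hypothetical toy data standing in for the two tables in the question
average_income <- data.frame(
  Education      = c("PhD", "PhD", "Master"),
  Marital_Status = c("Single", "Married", "Single"),
  Mean_Income    = c(52000, 61000, 48000),
  stringsAsFactors = FALSE
)

marketing_data <- data.frame(
  Education      = c("PhD", "Master", "PhD", "Master"),
  Marital_Status = c("Married", "Single", "Single", "Single"),
  Income         = c(70000, NA, NA, 45000),
  stringsAsFactors = FALSE
)

# Build one key per row from the two columns that must agree
key_data   <- paste(marketing_data$Education, marketing_data$Marital_Status, sep = "|")
key_lookup <- paste(average_income$Education, average_income$Marital_Status, sep = "|")

# For each marketing_data row, the position of its match in average_income
# (NA when no combination matches)
idx <- match(key_data, key_lookup)

# Overwrite only the rows whose Income is missing
missing_rows <- is.na(marketing_data$Income)
marketing_data$Income[missing_rows] <- average_income$Mean_Income[idx[missing_rows]]

marketing_data$Income
```

Rows that already have an Income are left untouched, and a missing Income whose Education and Marital_Status combination is absent from average_income stays NA. A merge() on both columns followed by ifelse(is.na(Income), Mean_Income, Income) would be another option, though merge() may reorder rows.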
