I have a dataset with over 100 variables. One variable, "ZIP_CODE", holds the ZIP code of each person's address, and another, "DISTRICT", holds the name of the district that ZIP code belongs to.
The thing is, I need to "clean" the district names that are written incorrectly. I have a data frame listing the state's ZIP codes with their districts, and another list with only the district names.
I thought about doing something like this:
test$district_new <- ifelse(test$district[!grepl(paste(list_districts, collapse="|"), test$district), ]), left_join(test, zip_cope, by = "NU_ZIP", test$district)
The main idea is: if the district in my dataset isn't present in my district list, then I want to "join" the ZIP codes from my dataset with the zip_code list so I can replace the wrong name with the right one from that list.
But this code isn't working for me.
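For reference, the idea described above could be sketched in base R roughly as follows. This is only a sketch under assumptions: the toy data, the shared key column name `ZIP_CODE`, and the `DISTRICT` column in the lookup table are hypothetical, and it uses an exact `%in%` check rather than the `grepl()` pattern match from my attempt.

```r
# Hypothetical toy data; the real column names and values may differ.
test <- data.frame(
  ZIP_CODE = c("10001", "10002", "10003"),
  DISTRICT = c("Centro", "Cntro", "Norte"),
  stringsAsFactors = FALSE
)

# Assumed lookup table: one row per ZIP code with its correct district.
zip_code <- data.frame(
  ZIP_CODE = c("10001", "10002", "10003"),
  DISTRICT = c("Centro", "Centro", "Norte"),
  stringsAsFactors = FALSE
)

# Assumed list of valid district names.
list_districts <- unique(zip_code$DISTRICT)

# Look up the reference district for every row by ZIP code
# (NA when the ZIP code is not in the lookup table).
ref_district <- zip_code$DISTRICT[match(test$ZIP_CODE, zip_code$ZIP_CODE)]

# Keep the original name when it is already valid or when no reference
# district exists; otherwise take the name from the lookup table.
is_valid <- test$DISTRICT %in% list_districts
test$district_new <- ifelse(is_valid | is.na(ref_district),
                            test$DISTRICT,
                            ref_district)
```

Using `match()` keeps the row order and row count of `test` unchanged, which a join would not guarantee if the lookup table had duplicate ZIP codes.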