Monday, 29 August 2016

R: Improving Nested ifelse Statements & Multiple Patterns

I am continuing some data-cleaning practice with animal shelter data. My goal here is to reduce the number of breed categories.

I am using each breed category as a partial pattern match against the Single.Breed column of the outgoing data frame. In some cases the breed is simply Chihuahua, but it may also be Long Hair Chihuahua (hence my use of grepl). Any breed that contains a category name is then represented by that category in a separate column. I also still need to add the cat breed categories, which will make the code even messier.

The code below is my "solution", but it's quite clunky. Is there a better, slicker and/or more efficient way to accomplish this?

# Dogs: the first pattern that matches Single.Breed decides the category.
# Anything that is not a dog is labelled "Cat" for now.
BreedCategories <- ifelse(outgoing$New.Type == "Dog",
           ifelse(grepl("Chihuahua", outgoing$Single.Breed, ignore.case = TRUE), "Chihuahua",
           ifelse(grepl("Pit Bull", outgoing$Single.Breed, ignore.case = TRUE), "Pit Bull",
           ifelse(grepl("Terrier", outgoing$Single.Breed, ignore.case = TRUE), "Terrier",
           ifelse(grepl("Shepherd", outgoing$Single.Breed, ignore.case = TRUE), "Shepherd",
           ifelse(grepl("Poodle", outgoing$Single.Breed, ignore.case = TRUE), "Poodle",
           ifelse(grepl("Labrador|Retriever", outgoing$Single.Breed, ignore.case = TRUE), "Labrador",
           "Other")))))), "Cat")
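
One possible way to flatten the nesting, offered only as a sketch: keep the patterns in an ordered named vector and loop over it. The sample outgoing data frame, the dog_patterns vector, and the categorize_breed helper below are all hypothetical names made up for illustration; only grepl, ifelse, and the column names come from the question.

```r
# Hypothetical sample data standing in for the real `outgoing` data frame.
outgoing <- data.frame(
  New.Type     = c("Dog", "Dog", "Dog", "Dog", "Cat"),
  Single.Breed = c("Long Hair Chihuahua", "Golden Retriever",
                   "Pit Bull Terrier", "Beagle", "Domestic Shorthair"),
  stringsAsFactors = FALSE
)

# Ordered lookup (assumed name): names are the output categories,
# values are the regular expressions to search for.
# Order matters: earlier entries take priority, as in the nested ifelse().
dog_patterns <- c(
  "Chihuahua" = "Chihuahua",
  "Pit Bull"  = "Pit Bull",
  "Terrier"   = "Terrier",
  "Shepherd"  = "Shepherd",
  "Poodle"    = "Poodle",
  "Labrador"  = "Labrador|Retriever"
)

# Hypothetical helper: returns one category per element of `breed`.
categorize_breed <- function(breed, patterns, default = "Other") {
  result <- rep(default, length(breed))
  # Walk the patterns in reverse so that earlier patterns overwrite later ones.
  for (category in rev(names(patterns))) {
    hit <- grepl(patterns[[category]], breed, ignore.case = TRUE)
    result[hit] <- category
  }
  result
}

BreedCategories <- ifelse(outgoing$New.Type == "Dog",
                          categorize_breed(outgoing$Single.Breed, dog_patterns),
                          "Cat")
print(BreedCategories)
```

With this layout, adding the cat breeds would presumably mean writing a second named vector of patterns and passing it to the same helper in place of the fixed "Cat" label, rather than adding another level of nesting.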
