Monday, 1 May 2017

dplyr mutate with ifelse: if true, keep what's in column x

I have a data frame with a feature called place that has many levels. My goal is to keep the top ten levels and replace all the others with "other".

topten.area <- names(sort(table(raw.train$place), decreasing = TRUE)[1:10])

This returns a character vector of names of the top ten levels.

> topten.area
 [1] "Glasgow"   "Edinburgh" "Aberdeen"
 [4] "Dundee"    "Stirling"  "Inverness"
 [7] "Perth"     "Aye"       "Dingwall"
[10] "Avoch"

p.train <- raw.train %>%
  mutate(place = ifelse(place %in% topten.area, place, "other"))

I had hoped to see the feature "place" updated so that its values are either one of the top ten levels or "other". Instead I get this:

> unique(p.train$place)
 [1] "other" "66"    "61"    "49"    "73"    "135"   "103"   "95"    "106"   "88"    "104"  
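
A likely explanation, assuming place is stored as a factor (which the mention of "levels" suggests): ifelse() does not keep the factor attributes of its yes argument, so the underlying integer codes come through instead of the labels, and they are then coerced to strings alongside "other". The sketch below reproduces this in base R on a small made-up data frame (the values and the two-element topten.area are hypothetical stand-ins, not the original data) and shows that converting to character first keeps the labels.

```r
# Toy data standing in for raw.train (hypothetical values, not the original data).
raw.train <- data.frame(
  place = factor(c("Glasgow", "Glasgow", "Perth", "Perth", "Avoch", "Wick"))
)
# Pretend these two are the "top" levels; the question keeps ten.
topten.area <- c("Glasgow", "Perth")

# ifelse() drops the factor attributes of `yes`, so the integer codes leak through.
broken <- ifelse(raw.train$place %in% topten.area, raw.train$place, "other")

# Converting the factor to character first keeps the level labels.
fixed <- ifelse(raw.train$place %in% topten.area,
                as.character(raw.train$place), "other")

print(unique(broken))
print(unique(fixed))
```

Under the same assumption, the mutate call in the question should behave as intended with place wrapped the same way: mutate(place = ifelse(place %in% topten.area, as.character(place), "other")).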
