Monday, 12 January 2015

Create column with grouped values based on another column in dplyr

I'm sure this has been asked before, but I don't know what to search for, so I apologise in advance.


Let's say that I have the following data frame:



grades <- data.frame(a = 1:40, b = sample(45:100, 40))


Using dplyr, I want to create a new variable that indicates the grade the student received, based on the following criteria: 90-100 = excellent, 80-89 = very good, etc.


I thought I could get that result by nesting ifelse() inside of mutate():



grades %>%
  mutate(ifelse(b >= 90, "excellent"),
         ifelse(b >= 80 & b < 90, "very_good"),
         ifelse(b >= 70 & b < 80, "fair"),
         ifelse(b >= 60 & b < 70, "poor", "fail"))


This doesn't work: I get the error message 'argument "no" is missing, with no default'. I thought the "no" would be the "fail" at the end, but obviously I'm getting the syntax wrong.


I can get this to work if I first filter the original data for each grade band individually, and then call ifelse(), as follows:



a <- grades %>%
  filter(b >= 90) %>%
  mutate(final = ifelse(b >= 90, "excellent"))


and then rbind a, b, c, etc. Obviously, this isn't how I want to do it, but I wanted to understand the syntax of ifelse(). I'm guessing the latter works because no values fail the condition, so the "no" argument is never needed, but I still can't figure out how to get it to work when there is more than one ifelse().
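For reference, here is a minimal sketch of what the nested form might look like. It assumes the problem is that ifelse() takes three arguments (test, yes, no), so each inner ifelse() has to sit in the "no" position of the one before it rather than being passed to mutate() as a separate argument. The set.seed() call and the names graded and final are my own additions for illustration.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed so the sample is reproducible
grades <- data.frame(a = 1:40, b = sample(45:100, 40))

# Every ifelse() gets all three arguments: test, yes, no.
# The "no" of each call is the next ifelse(), so the conditions chain,
# and only the innermost call carries the final fallback "fail".
graded <- grades %>%
  mutate(final = ifelse(b >= 90, "excellent",
                 ifelse(b >= 80, "very_good",
                 ifelse(b >= 70, "fair",
                 ifelse(b >= 60, "poor", "fail")))))

head(graded)
```

Because the tests are checked in order, the upper bounds (b < 90 and so on) are no longer needed: a row only reaches the second test if it already failed the first.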

