Thursday, 17 December 2020

Applying if else statements involving two dataframe columns in R

I am trying to modify a dataframe of two columns by adding a third column that holds one of four possible labels, depending on the contents of the other two columns (i.e. whether each is positive or negative).

I have tried a couple of approaches: the 'mutate' function in dplyr, and sapply. Unfortunately I seem to be missing something, as I get the warning "the condition has length > 1 and only the first element will be used". As a result, only the value computed for the first row is applied to every row of the new column.

A reproducible example (of the mutate approach I've tried) is as follows:

library(dplyr)  # needed for mutate()

Costs <- c(2, -5, -7, 3, 12)
Outcomes <- c(-2, 5, -7, 3, -2)

results <- data.frame(Costs, Outcomes)
results

# Argument names now match the names used in the body (costs, outcomes)
quadrant <- function(costs, outcomes) {
        if (costs < 0 & outcomes < 0) {
                "SW Quadrant"
        } else if (costs < 0 & outcomes > 0) {
                "Dominant"
        } else if (costs > 0 & outcomes < 0) {
                "Dominated"
        } else {
                ""
        }
}


results <- mutate(results, Quadrant = quadrant(Costs, Outcomes))

The full warning message is:

Warning messages:
1: Problem with mutate() input Quadrant.
i the condition has length > 1 and only the first element will be used
i Input Quadrant is quadrant(results$Costs, results$Outcomes).
2: In if (costs < 0 & outcomes < 0) { :
  the condition has length > 1 and only the first element will be used
3: Problem with mutate() input Quadrant.
i the condition has length > 1 and only the first element will be used
i Input Quadrant is quadrant(results$Costs, results$Outcomes).
4: In if (costs < 0 & outcomes > 0) { :
  the condition has length > 1 and only the first element will be used
5: Problem with mutate() input Quadrant.
i the condition has length > 1 and only the first element will be used
i Input Quadrant is quadrant(results$Costs, results$Outcomes).
6: In if (costs > 0 & outcomes < 0) { :
  the condition has length > 1 and only the first element will be used
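
For context, the warning appears because if expects a single TRUE/FALSE value, while mutate hands the function whole columns at once, so only the first row's comparison decides the result. A minimal sketch of one possible way around this, assuming base R's element-wise ifelse() is acceptable here; the helper name quadrant_vec is hypothetical and not part of the original code:

```r
Costs <- c(2, -5, -7, 3, 12)
Outcomes <- c(-2, 5, -7, 3, -2)
results <- data.frame(Costs, Outcomes)

# Hypothetical vectorised helper: ifelse() tests every element,
# so each row gets its own label instead of the first row's label.
quadrant_vec <- function(costs, outcomes) {
  ifelse(costs < 0 & outcomes < 0, "SW Quadrant",
  ifelse(costs < 0 & outcomes > 0, "Dominant",
  ifelse(costs > 0 & outcomes < 0, "Dominated", "")))
}

results$Quadrant <- quadrant_vec(results$Costs, results$Outcomes)
results
```

Because quadrant_vec works on whole vectors, it could presumably also be called inside mutate in the same way as the original function.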

My attempt with sapply:

results <- sapply(results$Quadrant,quadrant(results$Costs,results$Outcomes))

This leads to the following error, along with the same warnings as the mutate approach:

Error in get(as.character(FUN), mode = "function", envir = envir) : object 'Dominated' of mode 'function' was not found
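
This error seems to follow from how sapply is called: its second argument must be a function, but here quadrant(...) is evaluated first and returns the string "Dominated", which sapply then tries to look up as a function name. A minimal sketch of a row-by-row alternative, assuming mapply() from base R fits the purpose and that the function is rewritten to take one cost and one outcome at a time:

```r
Costs <- c(2, -5, -7, 3, 12)
Outcomes <- c(-2, 5, -7, 3, -2)
results <- data.frame(Costs, Outcomes)

# Scalar version: expects a single cost and a single outcome,
# so `if` always receives a condition of length one.
quadrant <- function(cost, outcome) {
  if (cost < 0 && outcome < 0) {
    "SW Quadrant"
  } else if (cost < 0 && outcome > 0) {
    "Dominant"
  } else if (cost > 0 && outcome < 0) {
    "Dominated"
  } else {
    ""
  }
}

# mapply() passes the function itself (not its result) and calls it
# once per pair of elements taken from the two columns.
results$Quadrant <- mapply(quadrant, results$Costs, results$Outcomes)
results
```

The function is passed by name, without parentheses, which is the difference from the sapply call above.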

I'm sure I'm missing something obvious here. Grateful for any suggestions.
