Tuesday, 16 February 2021

Create a new variable based on stratified cut-offs using the ifelse function in R: iris dataset example

I'm trying to create a new variable, e.g. iris$Sepal.Length_above, that holds a numeric, species-dependent classification of another variable, e.g. whether sepal length is at or above (1) or below (0) a cut-off. I'll illustrate using iris.

data("iris")
iris_rm <- subset(iris, Species == 'setosa')
iris_2 <- iris[!(iris$Species %in% iris_rm$Species),] #two species

For variables without species-specific cut-offs, I've used the line below:

iris_2$Sepal.Width_above <- ifelse(iris_2$Sepal.Width >= 3.0, 1, 0)#1 is above cut-off

Now I want to do the same, but with species-dependent cut-offs. Assume:

#Species "virginica" has Sepal.Length cut-off: 6.5
#Species "versicolor" has Sepal.Length cut-off: 6.0

The best I've come up with is the code below, but there are two problems.

library(dplyr)
iris_2$Sepal.Length_above <- if (iris_2$Species == 'virginica') {
  ifelse(iris_2$Sepal.Length >= 6.5, 1, 0)
} else if (iris_2$Species == 'versicolor') {
  ifelse(iris_2$Sepal.Length >= 6.0, 1, 0)
}
View(iris_2)
#problem 1: 6.0 seems to override the 6.5 for virginica
#problem 2: >= and <= seem to be switched

I would be so grateful for help!
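
One possible way to get species-dependent cut-offs, offered as a sketch rather than a definitive answer: `if ()` tests a single condition and is not vectorized (before R 4.2 it used only the first element with a warning, and from R 4.2 onward it raises an error for a longer condition), which would explain why one cut-off appeared to apply to every row. The example below instead looks up a per-row threshold from a named vector and compares element-wise. The names `cutoffs`, `row_cutoff` and `nested` are illustrative and not part of the original code, and `droplevels()` is an added assumption to discard the unused "setosa" level.

```r
# Sketch: species-specific cut-offs via a named lookup vector (base R only).
data("iris")
iris_2 <- droplevels(subset(iris, Species != "setosa"))  # keep two species

# Cut-offs taken from the question; `cutoffs` is an illustrative name.
cutoffs <- c(virginica = 6.5, versicolor = 6.0)

# Index the lookup by species name so each row gets its own threshold.
# as.character() avoids indexing by the factor's integer codes.
row_cutoff <- unname(cutoffs[as.character(iris_2$Species)])

# Element-wise comparison; TRUE/FALSE becomes 1/0 (1 = at or above cut-off).
iris_2$Sepal.Length_above <- as.integer(iris_2$Sepal.Length >= row_cutoff)

# Equivalent nested ifelse(), closer to the original attempt.
nested <- ifelse(iris_2$Species == "virginica",
                 ifelse(iris_2$Sepal.Length >= 6.5, 1, 0),
                 ifelse(iris_2$Sepal.Length >= 6.0, 1, 0))

# Cross-tabulate species against the new flag to check the result.
table(iris_2$Species, iris_2$Sepal.Length_above)
```

The lookup vector keeps the thresholds in one place, so adding a third species only means adding one entry, whereas the nested `ifelse()` form needs another level of nesting for each additional species.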
