Friday, 25 January 2019

Avoiding writing a long if-else statement in R

I have run into a situation where I have data like this:

df <- data.frame(id = 1:1000,
                 x = sample(0:30, 1000, replace = TRUE),
                 y = sample(50:10000, 1000, replace = TRUE))

I want to assign another column called z based on multiple conditions i.e.

if x <= 5 & y <= 100, z = 1
if x > 5 & x <= 10 & y <= 100, z = 2
if x > 10 & x <= 12 & y <= 100, z = 3
if x > 10 & x <= 12 & y <= 100, z = 4
if x > 12 & x <= 20 & y <= 100, z = 5
if x > 20 & x <= 30 & y <= 100, z = 6
if x <= 5 & y > 100 & y <= 1000, z = 7
if x > 5 & x <= 10 & y > 100 & y <= 1000, z = 8
if x > 10 & x <= 12 & y > 100 & y <= 1000, z = 9
if x > 10 & x <= 12 & y > 100 & y <= 1000, z = 10
if x > 12 & x <= 20 & y > 100 & y <= 1000, z = 11
if x > 20 & x <= 30 & y > 100 & y <= 1000, z = 12
.
.
.

and so on. I hope you get the drift.

The obvious solution is to write a long nested ifelse statement, something like this:

library(dplyr)  # provides %>% and mutate()

df %>% mutate(z = ifelse(x <= 5 & y <= 100, 1,
                  ifelse(x > 5 & x <= 10 & y <= 100, 2,
                  ifelse(x > 10 & x <= 12 & y <= 100, 3,
                  ........... and so on))))

Such scripts can become endlessly long, so I wonder whether there are other ways to achieve this without writing a long ifelse statement.
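One possible direction, offered as a rough sketch rather than a definitive answer: since every rule is a band of x combined with a band of y, both columns can be binned with cut() and z computed from the two bin indices. The helper name assign_z, the open-ended last y band (y > 1000), and the numbering scheme are assumptions. In particular, the rule for x in (10, 12] appears twice in the list above, so this sketch uses five x bands per y band and its z values will not match the list exactly.

```r
# Hypothetical helper: maps x and y to z via two sets of breakpoints.
# cut() uses right-closed intervals by default, e.g. (5, 10], which matches
# the "x > 5 & x <= 10" style of the rules in the question.
assign_z <- function(x, y,
                     x_breaks = c(-Inf, 5, 10, 12, 20, 30),  # from the question
                     y_breaks = c(-Inf, 100, 1000, Inf)) {   # last band is assumed
  x_bin <- cut(x, breaks = x_breaks, labels = FALSE)  # integer band index for x
  y_bin <- cut(y, breaks = y_breaks, labels = FALSE)  # integer band index for y
  n_x <- length(x_breaks) - 1                         # number of x bands
  # Number the cells row by row: all x bands of the first y band, then the next.
  (y_bin - 1) * n_x + x_bin
}

set.seed(1)  # only so the sketch is reproducible
df <- data.frame(id = 1:1000,
                 x = sample(0:30, 1000, replace = TRUE),
                 y = sample(50:10000, 1000, replace = TRUE))

df$z <- assign_z(df$x, df$y)
head(df)
```

Adding a band then only means adding a breakpoint. If the z values do not follow a regular numbering, the same two bin indices could instead index a lookup matrix holding whatever value each cell should get.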
