mercredi 17 juin 2020

Tidyverse: case_when() not returning correct value

For a set of columns, I wish to detect which column has the max value. If the value between columns is equal, then I would like to use a weight to decide which column to select. I've tried implementing this using case_when() however it does not work. I will show this problem below using an example dataset...

Let's say I have a dataset which includes these three columns (A, B, C) that correspond to the amount of Apple, Bananas or Carrots a child eats per day.

For row I want to log the most consumed food (i.e., column with the highest value). If the value between any column is equal (e.g., 1 apple and 1 banana), then apply the following rank. Apple > Banana > Carrot, whereby if a child eats 1 Apple and 1 Banana, the log will show Apple.

I've tried implementing this in R using by using pairs of if_else statements with case_when(). However, it does not return the correct result. For example, the final row should be classified as Apple, not carrot. I'm not sure what I'm doing wrong. Please provide a Tidyverse solution to this issue and if possible, explain why my approach did not work.

library(tidyverse)

A <- c(1,1,3,3)
B <- c(2,3,1,1)
C <- c(1,1,1,2)
df <- data.frame(A,B,C)

top_food <- df %>% 
  mutate(highest = case_when(
    C > B ~ "carrot", # if carrot > banana
    C > A ~ "carrot", # if carrot > apple 
    B > A ~ "banana", # if banana > apple 
    B >= A ~ "banana", # if banana >= carrot
    A >= B ~ "apple", # if apple  >= banana
    A >= C ~ "apple" # if apple >= carrot
  )) 

> | A | B | C | HIGHEST |  |
> | 1 | 2 | 1 | banana  |  |
> | 1 | 3 | 1 | banana  |  |
> | 3 | 1 | 1 | apple   |  |
>   3 | 1 | 2 | carrot  |  |

Notes: - this is an example dataset. - I'm open to solutions with different functions but please provide Tidyverse answers as this is how I am learning R. If possible, please explain why my approach using case_when() did not work so I can learn. - It's important to maintain the shape/layout of the dataset so it can be used in software outside of R, so please do not convert to long format.

Aucun commentaire:

Enregistrer un commentaire