Sunday, 22 August 2021

Filtering out with conditions of descending priority

I'm having trouble selecting rows based on conditions of descending priority. I've tried to find a solution but can't. It seems like a simple task, but I just can't figure it out.

This is just a general example of what I would like to do.

group <- structure(list(type = c(100815L, 100815L, 100815L, 100815L, 
100815L, 100815L, 100815L), x = structure(c(1L, 
1L, 1L, 2L, 1L, 2L, 2L), .Label = c("No", "Yes"), class = "factor"), 
    y = c(1.51098844290943, 2.31001922745969, 1.52639281812227, 
    0, 0, 0, 0), z = c(0, 0, 0, 25, 0, 50, 25)), row.names = c(NA, 
-7L), class = c("tbl_df", "tbl", "data.frame"))


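library(dplyr)

group <- group %>%
  group_by(type) %>%
  # Keep a group only if exactly one row satisfies all three conditions.
  filter(sum(x == "Yes" &
             y == min(y[x == "Yes"]) &
             # take the max of z only over "Yes" rows at the minimum y
             z == max(z[x == "Yes" & y == min(y[x == "Yes"])])) == 1)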

So basically, from a large sample, I want to keep the groups that have exactly one such row, i.e. there are no ties:

  • x = "Yes" AND
    • among the rows with x = "Yes", y is the minimum AND
      • within that final, smallest subset, z has the largest value.

It's also possible that no such row exists in a group.
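To illustrate what I mean by stacking, here is a minimal sketch of the three conditions written as successive `filter()` calls, each one working on the rows the previous one left. It assumes dplyr; the second group (`type` 2) is hypothetical and only added to show a tie being dropped, and the y values are rounded from the example above. The names `winners` and `kept` are made up for the illustration.

```r
library(dplyr)

# Data from the question (y rounded), plus a hypothetical tied group (type 2).
group <- tibble(
  type = c(rep(100815L, 7), rep(2L, 2)),
  x = factor(c("No", "No", "No", "Yes", "No", "Yes", "Yes", "Yes", "Yes"),
             levels = c("No", "Yes")),
  y = c(1.51, 2.31, 1.53, 0, 0, 0, 0, 1, 1),
  z = c(0, 0, 0, 25, 0, 50, 25, 10, 10)
)

winners <- group %>%
  group_by(type) %>%
  filter(x == "Yes") %>%   # priority 1: only "Yes" rows
  filter(y == min(y)) %>%  # priority 2: minimum y among the remaining rows
  filter(z == max(z)) %>%  # priority 3: maximum z among the remaining rows
  filter(n() == 1) %>%     # keep the group only if one row is left (no tie)
  ungroup()

# All rows of the groups that have a unique winner.
kept <- semi_join(group, winners, by = "type")
```

Groups without any "Yes" row disappear at the first `filter()`, so `min()` and `max()` are never evaluated on an empty set. Here `winners` holds the single row of group 100815 with z = 50, and the tied group 2 is dropped.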

This is a simplified example of a problem I constantly run into. I have a larger dataset where I need to assign values to cells based on multiple conditions of descending priority. I want to be able to stack conditions sequentially.
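For the assignment case, one possible sketch is dplyr's `case_when()`, which checks its conditions in order and uses the first one that matches, so the order of the lines is the priority. The data frame `df`, the column `label`, and the three rules below are hypothetical and only meant to show the pattern.

```r
library(dplyr)

# Hypothetical toy data, not from the real dataset.
df <- tibble(
  x = c("Yes", "Yes", "No", "No"),
  y = c(0, 2, 0, 3),
  z = c(50, 10, 0, 5)
)

df <- df %>%
  mutate(label = case_when(
    x == "Yes" & y == 0 ~ "first",   # highest priority
    x == "Yes"          ~ "second",  # only reached if the rule above failed
    z > 0               ~ "third",   # only reached if both rules above failed
    TRUE                ~ "none"     # fallback when nothing matches
  ))
```

Each row gets the value of the first rule it satisfies, so a later rule never overrides an earlier one.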
