Friday, 7 May 2021

data.table: apply a filter on groups only if a given row exists in the group

I am trying to apply a filter to each group in a data.table only if a certain value exists in that group. If the value doesn't exist, the filter is not applicable and all the rows of the group are retained. Similar to this

I am looking for a data.table version of this answer, if possible, but with some additional criteria.

Firstly, I tried the following:

test <- data.table(grp=c(1,1,1,10,10,10,12,12), c=c("a", "b", "c", "b", "c", "c","a","b"))
test[test[, .I[c=="a" | all(c!="a")], by = grp]$V1]

Suggestions for improvement are welcome.

An additional criterion that I am trying to incorporate is to check whether grp belongs to another list. The filter is applicable only if grp belongs to that list.

lst <- c("1", "8")
test[test[, .I[(c=="a" & grp %in% lst) | all(c!="a")], by = grp]$V1]

Here, the filter should apply only to grp 1 and not to grp 12, because 12 does not exist in lst. Instead of returning all rows with grp 12, the code drops them entirely. That is clearly wrong, and I would like to know how to incorporate the condition correctly.
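To see why the rows for grp 12 disappear, it can help to look at the logical value the second attempt computes for each row. The snippet below is only a diagnostic sketch: the column name keep and the variable diag are made up for illustration, and the data are the same test and lst as above.

```r
library(data.table)

test <- data.table(grp = c(1, 1, 1, 10, 10, 10, 12, 12),
                   c   = c("a", "b", "c", "b", "c", "c", "a", "b"))
lst <- c("1", "8")

# Same condition as in the second attempt, but returned as a column
# ("keep" is an illustrative name) instead of being used to subset .I
diag <- test[, .(c, keep = (c == "a" & grp %in% lst) | all(c != "a")), by = grp]
print(diag)
```

For grp 12, the first part is FALSE for every row because 12 is not in lst, and all(c != "a") is FALSE because the group does contain an "a". Neither part of the condition is TRUE for any row, so the whole group is dropped.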

Expected result:

   grp c
1:   1 a
2:  10 b
3:  10 c
4:  10 c
5:  12 a
6:  12 b

For grp=1, the group exists in lst and contains c="a", so the filter is applied. For grp=10, no filter is needed because there is not a single row with c="a". For grp=12, the filter would be applicable, BUT because 12 doesn't belong to lst, the filter is not used.
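The three cases above can be written as a single condition by adding a term that keeps every row whenever grp is not in lst. This is only one possible sketch, not necessarily the most idiomatic data.table approach. The variable names idx and result are illustrative, and the comparison relies on R coercing the numeric grp to character when matching against lst.

```r
library(data.table)

test <- data.table(grp = c(1, 1, 1, 10, 10, 10, 12, 12),
                   c   = c("a", "b", "c", "b", "c", "c", "a", "b"))
lst <- c("1", "8")

# Keep a row if
#   - it is an "a" row, or
#   - its group is not in lst (filter not used), or
#   - its group has no "a" row at all (filter not applicable).
# Inside j with by = grp, grp is a single value, so the last two
# terms are length 1 and are recycled across the rows of the group.
idx <- test[, .I[c == "a" | !(grp %in% lst) | all(c != "a")], by = grp]$V1

result <- test[idx]
print(result)
```

Declaring lst as numeric, for example c(1, 8), would avoid depending on the numeric-to-character coercion in %in%.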

Thanks
