Thursday, July 25, 2019

R: dplyr: transmute where there is a match; else keep

I have a long data frame (about 10 million rows) with a unique key for each row (the key is a combination of columns) and a column of values.

I also have a short data frame whose unique keys match a few of the keys in the long data frame. These matching keys identify replacement values, supplied in the value column of the second data frame. Every key in the second data frame should match exactly one key in the first, though perhaps not in the same order. I want to efficiently produce a new data frame that takes its values from the first data frame where there is no match, and from the second where there is. I feel like there should be a join that does this, but I have not identified it.

library(dplyr)

df1 <- tibble(let = c("a", "b", "a", "b"), num = c(1, 1, 2, 2), val = c(.1, .2, .3, .4))
df2 <- tibble(let = c("a", "b"), num = c(1, 2), val = c(.5, .6))

# unknown_fn is a placeholder for the operation I am looking for
out <- df1 %>%
  transmute(let = let, num = num, val = unknown_fn(df2, by = c("let", "num")))

desired output:

let    num   val
"a"    1       .5
"b"    1       .2
"a"    2       .3
"b"    2       .6

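One possible approach, offered as a sketch rather than a definitive answer: a left join on the key columns followed by coalesce() behaves like the "replace where matched, else keep" operation described above. The column name new_val is a hypothetical helper introduced only for this illustration, and how this performs on 10 million rows has not been measured here.

```r
library(dplyr)

df1 <- tibble(let = c("a", "b", "a", "b"), num = c(1, 1, 2, 2), val = c(.1, .2, .3, .4))
df2 <- tibble(let = c("a", "b"), num = c(1, 2), val = c(.5, .6))

out <- df1 %>%
  # bring in the replacement values under a temporary name (hypothetical: new_val);
  # rows of df1 without a match in df2 get NA in new_val
  left_join(rename(df2, new_val = val), by = c("let", "num")) %>%
  # take the replacement where one exists, otherwise keep the original value
  mutate(val = coalesce(new_val, val)) %>%
  select(-new_val)

print(out)
```

left_join() keeps the rows of df1 in their original order, so for the example data val comes out as .5, .2, .3, .6. This relies on the keys in df2 being unique, as stated in the question; duplicated keys would duplicate rows in the result. It also assumes a replacement value is never NA, since coalesce() would then fall back to the original value.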