Thursday, 22 July 2021

R add values to tibble column in groups

I want to add a new column to a tibble of experiment data with multiple rows per participant, where the values for the new column are calculated for each participant in turn.

Let's assume the following dummy example:

library(tibble)

# Two participants with four rows each
my_data <- tibble(
  participant_id = c(rep(1, 4), rep(2, 4)),
  suffix = c('su', 'bi', 'fa', 'su', 'va', 'va', 'bi', 'su')
)

On a single vector of suffixes (i.e. only one participant), I have been able to use the following code to give me a corresponding vector of ones and zeros (1 where the suffix is unique, 0 where it's repeated):

ifelse(!suffix %in% suffix[duplicated(suffix)], 1, 0)
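
To make the single-participant case concrete, here is a minimal runnable sketch using the four suffixes of participant 1 from the dummy data. The names `repeated` and `is_unique` are only illustrative and are not part of the original code.

```r
# Suffixes for a single participant (participant 1 in the dummy data)
suffix <- c('su', 'bi', 'fa', 'su')

# Values that occur more than once in the vector
repeated <- suffix[duplicated(suffix)]

# 1 where the suffix occurs exactly once, 0 where it is repeated
is_unique <- ifelse(!suffix %in% repeated, 1, 0)
is_unique
# 0 1 1 0
```

Note that `duplicated()` on its own only flags the second and later occurrences, which is why the `%in%` lookup is needed to mark the first 'su' as repeated too.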

But I can't work out how to do this for each participant in turn to get a column containing a 1 where a suffix is unique for that participant and a 0 where it's repeated for that participant.

The only (ugly) way I can think of to do it is to create a new dummy column which glues together participant_id and suffix (so the values would be e.g. '1_su', '1_bi' etc.) and run the ifelse statement on that column. Is there a nicer way to do it just grouping by participant_id?
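
For reference, the two options can be sketched side by side: the glued-key workaround described above, and one possible grouped version. This assumes the dplyr package is available; the column name `is_unique` and the helper vectors `key` and `via_key` are hypothetical names chosen for illustration.

```r
library(dplyr)  # dplyr also re-exports tibble()

my_data <- tibble(
  participant_id = c(rep(1, 4), rep(2, 4)),
  suffix = c('su', 'bi', 'fa', 'su', 'va', 'va', 'bi', 'su')
)

# Workaround: glue id and suffix into one key ('1_su', '1_bi', ...)
# and run the same ifelse() check on the combined key
key <- paste(my_data$participant_id, my_data$suffix, sep = '_')
via_key <- ifelse(!key %in% key[duplicated(key)], 1, 0)

# Grouped sketch: after group_by(), mutate() evaluates the expression
# separately for each participant, so `suffix` refers only to the
# rows of the current group
result <- my_data %>%
  group_by(participant_id) %>%
  mutate(is_unique = ifelse(!suffix %in% suffix[duplicated(suffix)], 1, 0)) %>%
  ungroup()

result$is_unique
# 0 1 1 0 0 0 1 1
```

On the dummy data both approaches give the same vector, so the grouped form would simply avoid building the extra key column.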
