Monday, 31 December 2018

Removing duplicates in R based on a condition

I need to embed a condition in a remove-duplicates function. I am working with a large student database from South Africa, a highly multilingual country. Last week you guys gave me the code to remove duplicates caused by retakes, but I now realise my language exam data shows some students offering more than one language. The source data, simplified, looks like this:

STUDID   MATSUBJ     SCORE
101      AFRIKAANSB   1
101      AFRIKAANSB   4
102      ENGLISHB     2
102      ISIZULUB     7
102      ENGLISHB     5

The result file I need is:

STUDID   MATSUBJ      SCORE  flagextra
101      AFRIKAANSB   4
102      ENGLISHB     5
102      ISIZULUB     7      1

I need to flag the extra language so that I can see which languages they are and make a new category for them.
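One possible approach, sketched in base R only. It rests on two assumptions that the question does not state: that a retake is resolved by keeping the highest SCORE, and that the first language listed for a student is the main one, so any other language is the "extra". The names `students`, `best`, `first_rows` and `main_subj` are made up for illustration.

```r
# Sample data copied from the question
students <- data.frame(
  STUDID  = c(101, 101, 102, 102, 102),
  MATSUBJ = c("AFRIKAANSB", "AFRIKAANSB", "ENGLISHB", "ISIZULUB", "ENGLISHB"),
  SCORE   = c(1, 4, 2, 7, 5),
  stringsAsFactors = FALSE
)

# Step 1: collapse retakes, keeping the highest SCORE per student and subject
# (assumption: the best attempt is the one to keep)
best <- aggregate(SCORE ~ STUDID + MATSUBJ, data = students, FUN = max)

# Step 2: look up each student's main subject
# (assumption: the first subject listed for a student is the main one)
first_rows <- students[!duplicated(students$STUDID), c("STUDID", "MATSUBJ")]
main_subj  <- setNames(first_rows$MATSUBJ, first_rows$STUDID)

# Step 3: flag every subject that differs from the student's main subject
is_extra <- best$MATSUBJ != unname(main_subj[as.character(best$STUDID)])
best$flagextra <- as.integer(is_extra)

# Sort so each student's main subject comes before any extras
best <- best[order(best$STUDID, best$flagextra), ]
rownames(best) <- NULL

print(best)
```

On the sample data this gives three rows: 101 AFRIKAANSB 4 and 102 ENGLISHB 5 with flagextra 0, and 102 ISIZULUB 7 with flagextra 1. The flag is 0 rather than blank for non-extra rows, which makes it easy to pull out the extras with `best[best$flagextra == 1, ]`. If the main language should be chosen differently (for example, the subject with the most attempts), only Step 2 would need to change.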
