Tuesday, 20 July 2021

How to distill 15 unique values into 5 chosen categories in R?

I have a df:

Weight   Age     Race
56       10      WHITE - RUSSIAN 
190      54      HISPANIC/LATINO - CUBAN
99       14      SOUTH AMERICAN
80       9       BLACK/AFRICAN 
200      19      ASIAN - CHINESE
201      20      ASIAN
180      90      WHITE
17       2       UNKNOWN/NOT SPECIFIED 
100      10      BLACK/CAPE VERDEAN 
110      11      
109      9       AMERICAN INDIAN/ALASKA NATIVE 

The Race column has 15 unique values. This is the output of unique(df$Race):

 [1] WHITE                                   
 [2] WHITE - RUSSIAN                         
 [3] ASIAN                                   
 [4] BLACK/AFRICAN AMERICAN                  
 [5] OTHER                                   
 [6] UNKNOWN/NOT SPECIFIED                   
 [7] BLACK/AFRICAN                           
 [8] HISPANIC/LATINO - CUBAN                 
 [9] WHITE - OTHER EUROPEAN                  
[10] AMERICAN INDIAN/ALASKA NATIVE           
[11] SOUTH AMERICAN                          
[12] ASIAN - CHINESE                         
[13] BLACK/CAPE VERDEAN                      
[14] HISPANIC/LATINO - PUERTO RICAN          
[15]      

I'd like to collapse these into five buckets, where the numbers refer to the positions in the unique(df$Race) output above: "White" with [1,2,9], "Black" with [4,7,13], "Hispanic" with [8,11,14], "Asian" with [3,12], and "Other" with [5,6,10]. If a value is blank ([15]), I'd like it to remain blank.

I'd like the output to be:

Weight   Age     Race
56       10      White
190      54      Hispanic
99       14      Hispanic
80       9       Black 
200      19      Asian
201      20      Asian
180      90      White
17       2       Other
100      10      Black
110      11      
109      9       Other 
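
One possible way to get there, sketched in base R with a named lookup vector. This is only an illustration under a few assumptions: the sample rows are retyped by hand from the table above, Race may be a factor (so it is converted with as.character()), stray whitespace is trimmed, and the name race_lookup is made up for this example. Blank values are handled separately because R never matches an empty-string name when subsetting.

```r
# Sample data retyped from the question; the blank Race is kept as "".
df <- data.frame(
  Weight = c(56, 190, 99, 80, 200, 201, 180, 17, 100, 110, 109),
  Age    = c(10, 54, 14, 9, 19, 20, 90, 2, 10, 11, 9),
  Race   = c("WHITE - RUSSIAN", "HISPANIC/LATINO - CUBAN", "SOUTH AMERICAN",
             "BLACK/AFRICAN", "ASIAN - CHINESE", "ASIAN", "WHITE",
             "UNKNOWN/NOT SPECIFIED", "BLACK/CAPE VERDEAN", "",
             "AMERICAN INDIAN/ALASKA NATIVE"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: names are the original values, elements are the buckets.
race_lookup <- c(
  "WHITE"                         = "White",
  "WHITE - RUSSIAN"               = "White",
  "WHITE - OTHER EUROPEAN"        = "White",
  "BLACK/AFRICAN AMERICAN"        = "Black",
  "BLACK/AFRICAN"                 = "Black",
  "BLACK/CAPE VERDEAN"            = "Black",
  "HISPANIC/LATINO - CUBAN"       = "Hispanic",
  "SOUTH AMERICAN"                = "Hispanic",
  "HISPANIC/LATINO - PUERTO RICAN" = "Hispanic",
  "ASIAN"                         = "Asian",
  "ASIAN - CHINESE"               = "Asian",
  "OTHER"                         = "Other",
  "UNKNOWN/NOT SPECIFIED"         = "Other",
  "AMERICAN INDIAN/ALASKA NATIVE" = "Other"
)

# Normalise to trimmed character strings (works for factor or character input).
key <- trimws(as.character(df$Race))

# Look up each value; anything not in the lookup (including "") becomes NA.
bucket <- unname(race_lookup[key])

# Blank stays blank; values missing from the lookup are left as NA on purpose.
bucket[key %in% ""] <- ""

df$Race <- bucket
df
```

Any value that appears in the data but not in the lookup ends up as NA rather than being silently bucketed, which makes unexpected categories easy to spot. Packages such as dplyr (case_when) or forcats (fct_collapse) offer alternatives for the same recoding.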
