Friday, May 10, 2019

Create new column categorizing words from another column using ifelse and regex?

I need to create a new column/variable containing only two possible values, derived from a column of strings with many possible values (e.g., country names like "USA", "USA and Ecuador", and "Switzerland").

I've tried using ifelse plus some regex to make a new column which categorizes countries as either "WEIRD" or "NonWEIRD". The code runs, but it makes all values "NonWEIRD" (i.e., the ifelse function is not finding any TRUE results).

dataset$WEIRD<-ifelse(PhilCogCOR1$CITIZEN==".*Austria.*" |
                        PhilCogCOR1$CITIZEN==".*Belgium.*" |
                        PhilCogCOR1$CITIZEN==".*Canada.*" |
                        PhilCogCOR1$CITIZEN==".*Chile.*" |
                        PhilCogCOR1$CITIZEN==".*Czech Republic.*" |
                        PhilCogCOR1$CITIZEN==".*France.*" |
                        PhilCogCOR1$CITIZEN==".*Germany.*" |
                        PhilCogCOR1$CITIZEN==".*Hungary.*" |
                        PhilCogCOR1$CITIZEN==".*New Zealand.*" |
                        PhilCogCOR1$CITIZEN==".*Poland.*" |
                        PhilCogCOR1$CITIZEN==".*Portugal.*" |
                        PhilCogCOR1$CITIZEN==".*Spain.*" |
                        PhilCogCOR1$CITIZEN==".*Sweden.*" |
                        PhilCogCOR1$CITIZEN==".*Switzerland.*" | 
                        PhilCogCOR1$CITIZEN==".*Netherlands.*" |
                        PhilCogCOR1$CITIZEN==".*United Kingdom.*" |
                        PhilCogCOR1$CITIZEN==".*USA.*", 
                      "WEIRD", 
                      "NonWEIRD")

If this code were working as intended, I would get a column of mostly "WEIRD" values and some "NonWEIRD" values.
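
A likely cause, offered as a sketch rather than a definitive fix: `==` tests exact string equality, so ".*Austria.*" is compared as a literal string and never matches a value like "Austria". Regular-expression matching in base R goes through functions such as `grepl()`. The example below assumes a small made-up data frame standing in for PhilCogCOR1 (only the CITIZEN column name comes from the code above); the helper names `weird_countries` and `weird_pattern` are hypothetical.

```r
# Toy data standing in for PhilCogCOR1 (assumption: CITIZEN is a character column)
PhilCogCOR1 <- data.frame(
  CITIZEN = c("USA", "USA and Ecuador", "Switzerland", "Ecuador", NA),
  stringsAsFactors = FALSE
)

# Country names taken from the original ifelse() call
weird_countries <- c(
  "Austria", "Belgium", "Canada", "Chile", "Czech Republic", "France",
  "Germany", "Hungary", "New Zealand", "Poland", "Portugal", "Spain",
  "Sweden", "Switzerland", "Netherlands", "United Kingdom", "USA"
)

# Join the names into one alternation pattern: "Austria|Belgium|...|USA".
# grepl() matches anywhere in the string, so no leading/trailing ".*" is needed.
weird_pattern <- paste(weird_countries, collapse = "|")

# grepl() returns TRUE/FALSE per row; it returns FALSE for NA,
# so missing values end up labelled "NonWEIRD" here.
PhilCogCOR1$WEIRD <- ifelse(
  grepl(weird_pattern, PhilCogCOR1$CITIZEN),
  "WEIRD",
  "NonWEIRD"
)

print(PhilCogCOR1)
```

Note that this sketch assigns the new column to the same data frame it reads from, whereas the original code writes to `dataset` but reads from `PhilCogCOR1`; if those are meant to be the same object, the names should agree.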
