Tuesday, 18 December 2018

How to remove columns where more than 90% of the values are '0' in R

I had categorical variables, which I converted to dummy variables, and ended up with over 2381 variables. I won't need that many for analysis (say, regression or correlation). I want to remove a column if more than 90% of its values are '0'. Also, is there a better criterion for dropping columns than "more than 90% of the values are '0'"? Help!
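
For illustration, here is a minimal base R sketch of the filter described above. The data frame `df`, its column names, and the `threshold` variable are made up for the example; the real dummy-coded data would take the place of `df`. It assumes every column is numeric (0/1 dummies) and treats "over 90%" as strictly greater than 0.9.

```r
# Toy stand-in for the dummy-coded data (hypothetical column names).
df <- data.frame(
  a = c(rep(0, 19), 1),      # 95% zeros  -> should be dropped
  b = rep(c(0, 1), 10),      # 50% zeros  -> should be kept
  c = c(rep(0, 17), 1, 1, 1),# 85% zeros  -> should be kept
  d = rep(0, 20)             # 100% zeros -> should be dropped
)

# Assumed cutoff: drop a column when its share of zeros exceeds this value.
threshold <- 0.9

# df == 0 gives a logical matrix; colMeans() turns it into the
# proportion of zeros per column (NAs are ignored).
zero_share <- colMeans(df == 0, na.rm = TRUE)

# Keep only the columns at or below the cutoff; drop = FALSE keeps the
# result a data frame even if a single column survives.
df_reduced <- df[, zero_share <= threshold, drop = FALSE]

names(df_reduced)
```

On the second part of the question: one commonly used alternative is a near-zero-variance filter, for example `nearZeroVar()` in the caret package, which flags columns dominated by a single value. Whether that suits this data better than a fixed 90% cutoff depends on the analysis, so treat it as an option to try rather than a recommendation.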
