I had categorical variables, which I converted to dummy variables, and ended up with over 2381 variables. I won't need that many variables for analysis (say, regression or correlation). I want to remove a column if over 90% of its values are '0'. Also, is there a better criterion for removing columns than 90% of the values being '0'? Help!
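For concreteness, here is a minimal sketch of the 90% rule described above, assuming the data is in a pandas DataFrame (the post does not name a language or library). The function name `drop_sparse_columns`, the parameter `zero_threshold`, and the toy data are all made up for illustration.

```python
import pandas as pd


def drop_sparse_columns(df, zero_threshold=0.9):
    """Return df without the columns whose share of zeros exceeds zero_threshold.

    Hypothetical helper, not part of pandas.
    """
    # (df == 0) is a boolean frame; its column-wise mean is the fraction of zeros.
    zero_fraction = (df == 0).mean()
    keep = zero_fraction[zero_fraction <= zero_threshold].index
    return df[keep]


# Toy data standing in for the real categorical variables (an assumption).
raw = pd.DataFrame({
    "color": ["red"] * 18 + ["blue", "green"],
    "size": ["S", "L"] * 10,
})

# One 0/1 dummy column per category level.
dummies = pd.get_dummies(raw, dtype=int)

# color_blue and color_green are 95% zeros, so they are dropped.
reduced = drop_sparse_columns(dummies, zero_threshold=0.9)
print(list(reduced.columns))
```

The threshold is a parameter, so other cutoffs such as 0.95 or 0.99 can be tried without changing the function.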