I need to check the names of all variables in a data.frame: where a name matches, the NA values in that variable should be replaced with the median; for all other variables, NAs should be replaced with the mean.
The data.frame cyl_spec has 11 variables, and NAs should be replaced as follows:
- Viscosity: Impute with median
- Wax: Impute with median
- Others: Impute with mean
I could certainly do this one variable at a time, but I was trying the following code:
attach(cyl_spec)
var <- colnames(cyl_spec)
for(val in var)
{
if(val == 'viscosity'){viscosity[is.na(viscosity == T)] <- median(viscosity, na.rm = T)}
else if(val == 'wax'){wax[is.na(wax == T)] <- median(wax, na.rm = T)}
else {val[is.na(val == T)] <- mean(val, na.rm = T)}
}
detach(cyl_spec)
Somehow the code does not change anything: I still get the same number of NAs in the variable when I check with this command:
sum(is.na(cyl_spec$viscosity))
Also, when I run this code I get the following warning messages:
Warning messages:
1: In mean.default(val, na.rm = T) :
argument is not numeric or logical: returning NA
2: In mean.default(val, na.rm = T) :
argument is not numeric or logical: returning NA
3: In mean.default(val, na.rm = T) :
argument is not numeric or logical: returning NA
4: In mean.default(val, na.rm = T) :
argument is not numeric or logical: returning NA
5: In mean.default(val, na.rm = T) :
argument is not numeric or logical: returning NA
6: In mean.default(val, na.rm = T) :
argument is not numeric or logical: returning NA
7: In mean.default(val, na.rm = T) :
argument is not numeric or logical: returning NA
8: In mean.default(val, na.rm = T) :
argument is not numeric or logical: returning NA
9: In mean.default(val, na.rm = T) :
argument is not numeric or logical: returning NA
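For reference, both symptoms can be reproduced in isolation. In the loop, val holds a column name (a character string) rather than the column itself, so mean() receives text; and assigning to a variable exposed by attach() creates a separate copy instead of changing the data.frame. A minimal sketch, assuming a made-up two-column data.frame df (not the real cyl_spec; the column temp is hypothetical):

```r
# Hypothetical toy data; not the real cyl_spec
df <- data.frame(viscosity = c(1, NA, 3), temp = c(10, NA, 30))

val <- "temp"            # what the loop variable actually holds
is.numeric(val)          # FALSE: it is the column name, not the data

# mean() of a character string returns NA and raises the warning from the post
res_name <- suppressWarnings(mean(val, na.rm = TRUE))

# Indexing the data.frame by name reaches the actual column
res_col <- mean(df[[val]], na.rm = TRUE)   # 20

# attach() exposes copies of the columns; assigning to one creates a
# separate variable and leaves the data.frame untouched
attach(df)
viscosity[is.na(viscosity)] <- median(viscosity, na.rm = TRUE)
detach(df)
na_left <- sum(is.na(df$viscosity))        # still 1
```

This would explain why nine warnings appear (one per column other than viscosity and wax) and why the NA count in cyl_spec$viscosity stays the same.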
Could someone please help me find a solution? I am stuck. Thanks in advance!
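One possible way to express the intended rule is sketched below. It indexes the data.frame directly with [[ ]] instead of using attach(), and writes each imputed column back. This is a sketch under assumptions: the data here is a small stand-in for cyl_spec, the column temp is hypothetical, and the column names are assumed to be lowercase (viscosity, wax) as in the loop above, since the name comparison is case-sensitive.

```r
# Hypothetical stand-in for cyl_spec (the real one has 11 variables)
cyl_spec <- data.frame(
  viscosity = c(1, NA, 3, 10),
  wax       = c(2, 4, NA, 9),
  temp      = c(10, NA, 30, 50)
)

median_cols <- c("viscosity", "wax")   # columns to impute with the median

for (val in colnames(cyl_spec)) {
  x <- cyl_spec[[val]]                 # the column itself, not its name
  if (!is.numeric(x)) next             # skip non-numeric columns
  fill <- if (val %in% median_cols) {
    median(x, na.rm = TRUE)
  } else {
    mean(x, na.rm = TRUE)
  }
  x[is.na(x)] <- fill
  cyl_spec[[val]] <- x                 # write back into the data.frame
}

sum(is.na(cyl_spec))                   # 0 for this toy data
```

Keeping the median columns in a vector means further names can be added without another else if branch.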