Wednesday, 3 November 2021

sapply + if - retain column names

Even though this is related to "sapply - retain column names", I could not find the answer there.

I had a simple function to scale data between 0 and 1 that retained the column names:

scale <-  function(x){apply(x, 2, function(y) ((y)-min(y, na.rm=TRUE))/(max(y, na.rm=TRUE)-min(y, na.rm=TRUE)))}

Now I needed to add an if clause for the case where max(y) = min(y) and changed the function like so:

scale <- function(x){apply(x, 2, function(y) if(min(y, na.rm=TRUE)==max(y, na.rm=TRUE)) {0.5} else {((y)-min(y, na.rm=TRUE))/(max(y, na.rm=TRUE)-min(y, na.rm=TRUE))})}

Using these functions on an input data frame like so...

as.data.frame(scale(input[sapply(input,is.numeric)]))

produces different column names: the original function preserves the names, whereas the new one replaces special characters such as brackets and hyphens with dots:

Example column name w/o the IF: INL_Avg(S-B0-ETC-CDS-06C~PM_CD1_D_B0_SI_P0V_B.NM)

Example column name w/ the IF: INL_Avg.S.B0.ETC.CDS.06C.PM_CD1_D_B0_SI_P0V_B.NM.

While I do realize these column names are not ideal, they are what I need to use, and I would appreciate a hint as to how to avoid this special-character replacement (adding USE.NAMES=TRUE to the sapply won't help...).

Thanks, Mark
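
A likely explanation, offered as a sketch rather than a definitive answer: when the if branch returns the single value 0.5 for a constant column while the other columns return full-length vectors, the results differ in length, so apply() returns a list instead of a matrix. as.data.frame() on a list checks the column names by default and turns them into syntactic names, which is where the dots come from. Returning a vector of the same length as the column keeps the result a matrix, and the names survive. The function name scale01 and the example data below are hypothetical, chosen only for illustration (the new name also avoids masking base::scale()).

```r
# Hypothetical helper name; not part of the original question.
scale01 <- function(x) {
  apply(x, 2, function(y) {
    lo <- min(y, na.rm = TRUE)
    hi <- max(y, na.rm = TRUE)
    if (lo == hi) {
      # Return a full-length vector so every column has the same length
      # and apply() can still assemble a matrix.
      rep(0.5, length(y))
    } else {
      (y - lo) / (hi - lo)
    }
  })
}

# Made-up example data; check.names = FALSE keeps the awkward names on input.
input <- data.frame(
  "Avg(A-1)" = c(1, 2, 3),
  "Avg(B-2)" = c(5, 5, 5),   # constant column: triggers the if branch
  check.names = FALSE
)

result <- as.data.frame(scale01(input[sapply(input, is.numeric)]))
names(result)
```

If the list-returning version has to stay, passing optional = TRUE to as.data.frame() should also skip the name conversion, since that argument makes the conversion to syntactic column names optional.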
