Wednesday, 2 September 2020

How to paste MIN and MAX numeric values from a column that contains letters

I have a data frame whose columns contain special characters, words, and numbers. I would like to recode the C1 variable so that any '?' or word is kept as it is, while every entry that starts with a number is replaced by 'BETWEEN <min> AND <max>', where <min> and <max> are the lowest and highest numeric values in the column.

The problem with the approach I am trying below is that max() and min() also compare the letters and special characters, so I receive '?' and 'w' instead of '1' and '5'.

df <- data.frame("C1" = c("1 - info", "2", "word", "4", "5 - info", "?"), "C2" = c("P", "P", "F", "P", "F", "P"), stringsAsFactors = FALSE)


df$C1 <- ifelse(df$C1 == '?', '?',
                ifelse(!grepl('word', df$C1) & df$C1 != '?',
                       paste('BETWEEN', min(substr(df$C1, 1, 1)), 'AND', max(substr(df$C1, 1, 1))),
                       df$C1))
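As far as I can tell, the attempt above goes wrong because substr() returns a character vector, so min() and max() compare strings rather than numbers, and the resulting order depends on the locale's collation rules. A minimal sketch of that behaviour, using the example values from C1 (the names first_chars and digits are only illustrative):

```r
# First character of each example value in C1 (values taken from the question)
first_chars <- substr(c("1 - info", "2", "word", "4", "5 - info", "?"), 1, 1)

# On a character vector, min()/max() compare strings, not numbers;
# which element comes first or last depends on the locale's collation
min(first_chars)
max(first_chars)

# Keeping only the digit entries and converting them gives numeric extremes
digits <- as.numeric(first_chars[grepl("^[0-9]$", first_chars)])
min(digits)  # 1
max(digits)  # 5
```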

How can I paste the lowest and highest numeric values of a column without hard-coding them in the statement? The columns hold many different values, and the intervals differ.

The expected output is


df_exp <- data.frame("C1" = c("BETWEEN 1 AND 5", "BETWEEN 1 AND 5", "word", "BETWEEN 1 AND 5", "BETWEEN 1 AND 5", "?"), "C2" = c("P", "P", "F", "P", "F", "P"), stringsAsFactors = FALSE)
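One possible way to get there, as a rough sketch rather than a definitive solution: it assumes that an entry counts as numeric whenever it starts with one or more digits, and the helper names is_num and nums are made up for illustration.

```r
df <- data.frame(C1 = c("1 - info", "2", "word", "4", "5 - info", "?"),
                 C2 = c("P", "P", "F", "P", "F", "P"),
                 stringsAsFactors = FALSE)

# Assumption: an entry is "numeric" when it starts with one or more digits
is_num <- grepl("^[0-9]+", df$C1)

# Extract the leading digits of those entries and convert them to numbers
nums <- as.numeric(sub("^([0-9]+).*", "\\1", df$C1[is_num]))

# Replace only the numeric entries; '?' and words are left untouched
df$C1[is_num] <- paste("BETWEEN", min(nums), "AND", max(nums))
```

Because min() and max() are computed from the column itself, the same lines could be wrapped in a function and applied to other columns whose intervals differ.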
