I have a data frame with columns containing special characters, words and numbers. I would like to recode the C1 variable so that a '?' or a word is kept as it is, while every entry that starts with a number is replaced by 'BETWEEN <min> AND <max>', where <min> and <max> are the smallest and largest numeric values in the column.
The problem with my attempt below is that min() and max() are applied to character values, so the letters and special characters are compared as well, and I get '?' and 'w' instead of '1' and '5'.
df <- data.frame("C1" = c("1 - info", "2", "word", "4", "5 - info", "?"),
                 "C2" = c("P", "P", "F", "P", "F", "P"),
                 stringsAsFactors = FALSE)
df$C1 <- ifelse(df$C1 == '?', '?',
         ifelse(!grepl('word', df$C1) & df$C1 != '?',
                paste('BETWEEN', min(substr(df$C1, 1, 1)), 'AND', max(substr(df$C1, 1, 1))),
                df$C1))
How can I paste the highest and lowest numeric values in the column without hard-coding them in the statement? I have many columns with different values, and the intervals differ.
The expected output is:
df_exp <- data.frame("C1" = c("BETWEEN 1 AND 5", "BETWEEN 1 AND 5", "word", "BETWEEN 1 AND 5", "BETWEEN 1 AND 5", "?"),
                     "C2" = c("P", "P", "F", "P", "F", "P"),
                     stringsAsFactors = FALSE)
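One possible way to get there, sketched under the assumption that every "numeric" entry in C1 begins with one or more digits and that anything else (words, '?') should be left untouched: pick out the entries that start with a digit, convert only their leading digits to numbers, and take min() and max() of those numbers rather than of the raw strings. The names starts_with_digit and nums are illustrative and not from the original attempt.

```r
df <- data.frame(
  C1 = c("1 - info", "2", "word", "4", "5 - info", "?"),
  C2 = c("P", "P", "F", "P", "F", "P"),
  stringsAsFactors = FALSE
)

# TRUE for entries that begin with a digit (assumption: these are the "numbers")
starts_with_digit <- grepl("^[0-9]", df$C1)

# Keep only the leading run of digits of those entries and convert to numeric,
# so min() and max() compare numbers instead of characters
nums <- as.numeric(sub("^([0-9]+).*$", "\\1", df$C1[starts_with_digit]))

# Replace the numeric entries with the interval; keep everything else as it is
df$C1 <- ifelse(
  starts_with_digit,
  paste("BETWEEN", min(nums), "AND", max(nums)),
  df$C1
)
```

Because the whole leading run of digits is used, this also copes with values of more than one digit, which substr(df$C1, 1, 1) would truncate. If a column contains no entry starting with a digit, min() and max() are called on an empty vector and return Inf and -Inf with a warning, so such columns may need a guard.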