Thursday, 5 March 2020

Replace values across multiple variables in R

I have a dataframe with 82 variables. Many of the variables contain alphabetic letters, which I want to change into a set of numbers. I can do this column-by-column, number-by-number using the code below:

 library(tibble)
 library(dplyr)  # provides %>% and mutate()
 mydf <- tribble(~Var1, ~Var2.a, ~Var3.a, ~Var4.a,
            "A", "b", "b", "d",
            "B", "w", NA, "w",
            "C", "g", "k", "b",
            "D", "k", NA, "j")

 # Recode one column at a time: first group of letters -> 1, second group -> 2
 newdf <- mydf %>%
   mutate(Var2.a = ifelse(Var2.a %in% c("m", "p", "w", "h", "n"), 1, Var2.a),
          Var2.a = ifelse(Var2.a %in% c("k", "b", "g", "j", "f", "d"), 2, Var2.a),
          Var3.a = ifelse(Var3.a %in% c("m", "p", "w", "h", "n"), 1, Var3.a),
          Var3.a = ifelse(Var3.a %in% c("k", "b", "g", "j", "f", "d"), 2, Var3.a),
          Var4.a = ifelse(Var4.a %in% c("m", "p", "w", "h", "n"), 1, Var4.a),
          Var4.a = ifelse(Var4.a %in% c("k", "b", "g", "j", "f", "d"), 2, Var4.a))

But this will take a lot of time for the 70+ columns I need to change!

All the variables of interest share a letter combination in their names (".a" in the example data), so I should be able to apply an ifelse statement to just these columns using contains(). However, I can't work out how to do this!

I have looked at this answer, which I think is getting me close, but I can't work out how to embed an if-statement into it:

 newdf <- mydf %>%
   mutate_at(vars[2:4] = ifelse(vars %in% c("m", "p", "w", "h", "n"), 1, vars)

But I get the error Error in vars[2:4] : object of type 'closure' is not subsettable. I think the brackets are wrong here, and probably also the use of vars!
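
For context on the error: in dplyr, vars is a function, and R reports functions as type 'closure', so vars[2:4] is an attempt to index a function. One possible direction is sketched below. It assumes dplyr's mutate_at() with vars(contains(".a")) to pick the columns, and case_when() in place of the chained ifelse() calls. The helper name recode_letters is hypothetical and not part of any package. Unlike the column-by-column code, this sketch returns numeric columns and turns any letter outside the two groups into NA, which may or may not be what is wanted.

```r
library(tibble)
library(dplyr)

mydf <- tribble(~Var1, ~Var2.a, ~Var3.a, ~Var4.a,
                "A", "b", "b", "d",
                "B", "w", NA, "w",
                "C", "g", "k", "b",
                "D", "k", NA, "j")

# Hypothetical helper (an assumption for illustration):
# letters in the first group become 1, letters in the second group become 2.
# Anything else, including NA, becomes NA (this differs from the ifelse()
# version, which leaves unmatched values as they are).
recode_letters <- function(x) {
  case_when(
    x %in% c("m", "p", "w", "h", "n") ~ 1,
    x %in% c("k", "b", "g", "j", "f", "d") ~ 2,
    TRUE ~ NA_real_
  )
}

# Apply the helper to every column whose name contains ".a";
# Var1 does not match and is left untouched.
newdf <- mydf %>%
  mutate_at(vars(contains(".a")), recode_letters)
```

With dplyr 1.0.0 or later, mutate(across(contains(".a"), recode_letters)) should be an equivalent spelling. Note that contains() matches the literal string ".a" anywhere in the name and ignores case by default, so it is worth checking which of the 82 columns it actually selects before relying on it.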
