Friday, 10 May 2019

Create a new binary variable based on whether previous variable is in a vector (R)

I have a list of responses to a question, and I need to create a new variable that sorts these responses into two categories. The category depends on which of two lists a response appears in: one list holds all responses to be recoded as 0 in the new variable, and the other holds all responses to be recoded as 1.

I've tried to do this with a for loop that cycles through every row, tests the response variable, and assigns a value to the new variable based on which list the response is in. When I run it, however, every row gets a value of 1 for the new variable, regardless of the old variable.

A reproducible example:

df <- data.frame(state = state.name)
# create the reference lists
AtoM <- df$state[1:26]
NtoZ <- df$state[27:50]

for (i in seq_along(df$state)) {
  if (df$state[i] %in% AtoM) {
    df$state.bin <- 0
  } else if (df$state[i] %in% NtoZ) {
    df$state.bin <- 1
  } else {
    df$state.bin <- NA
  }
}
View(df) # when the result is viewed, the new state.bin variable has a value of 1 for every row


I expected the first 26 states to be assigned 0 for the new variable, but they are all assigned 1. Yet when I test df$state[1] %in% AtoM on its own, it returns TRUE.

What am I doing wrong?
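
A likely cause, offered as a sketch rather than a definitive answer: inside the loop, df$state.bin <- 0 assigns to the whole column, not to row i, so each iteration overwrites every row and only the last one (a state in NtoZ, hence 1) survives. Two possible fixes are shown below: a vectorised recode with ifelse() and %in%, and the original loop with the assignment indexed by [i]. The column name state.loop is a hypothetical name added only so the two results can be compared.

```r
df <- data.frame(state = state.name)
# create the reference lists
AtoM <- df$state[1:26]
NtoZ <- df$state[27:50]

# Option 1: vectorised recode, no loop needed
# 0 if the state is in AtoM, 1 if it is in NtoZ, NA otherwise
df$state.bin <- ifelse(df$state %in% AtoM, 0,
                       ifelse(df$state %in% NtoZ, 1, NA))

# Option 2: keep the loop, but assign to element i only
# (state.loop is a hypothetical column name used for comparison)
df$state.loop <- NA_real_  # pre-allocate the column
for (i in seq_along(df$state)) {
  if (df$state[i] %in% AtoM) {
    df$state.loop[i] <- 0
  } else if (df$state[i] %in% NtoZ) {
    df$state.loop[i] <- 1
  }
}

# Count how many rows fall in each category
table(df$state.bin, useNA = "ifany")
```

The vectorised form is usually preferable in R because %in% already operates on the whole column at once; the loop version is included only to show the minimal change to the original code.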
