Thursday, November 10, 2016

Confused about if statement and for loop in R

I have a data frame in R where one column is a factor with a few levels, and I want to create a dummy variable for each level. When I write a loop to do this, I get an error.

For example, if the column has the levels a, b, and c, and I want a 1/0 dummy variable for each one, the code I have to create one of them is:

h = rep(0, nrow(data))
for (i in 1:nrow(data)) {
  if (data[,1] == "a") {
    h[i] = 1
  } else {
    h[i] = 0
  }
}
cbind(data, h)

This gives me the error message "the condition has length > 1 and only the first element will be used". I have seen advice elsewhere on this site that I should write my own functions and avoid for loops, but I don't really understand (a) how to solve this by writing a function (at least not immediately) or (b) the benefit of using a function rather than a loop.
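For reference, here is a minimal sketch of what seems to trigger the message, assuming a small made-up data frame (the column name grp and its values are hypothetical, not my real data). The comparison data[, 1] == "a" tests the whole column at once and so yields one TRUE/FALSE per row, whereas if() expects a single value. Indexing the row with i gives it exactly one:

```r
# Hypothetical toy data standing in for the real data frame
data <- data.frame(grp = c("a", "b", "c", "a"), stringsAsFactors = TRUE)

# Comparing the whole column gives one logical value per row, not one overall
n_conditions <- length(data[, 1] == "a")

# Indexing with i tests a single row, which is what if() needs
h <- rep(0, nrow(data))
for (i in seq_len(nrow(data))) {
  if (data[i, 1] == "a") {
    h[i] <- 1
  }
}
data <- cbind(data, h)
```

Because h starts as all zeros, the else branch is not needed in this version.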

In the end I used ifelse() to create each vector and then cbind() to add it to the data frame, but an explanation would really be appreciated.
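A rough sketch of that ifelse() approach, plus one possible way to wrap it in a function, again on made-up data. The names grp, is_a, and make_dummies are hypothetical and only for illustration. ifelse() is vectorized, so a single call covers every row, and a function lets the same logic be reused for each level instead of copying the code once per dummy:

```r
# Hypothetical toy data standing in for the real data frame
data <- data.frame(grp = c("a", "b", "c", "a"), stringsAsFactors = TRUE)

# Vectorized: one call handles every row, no loop needed
data$is_a <- ifelse(data$grp == "a", 1, 0)

# make_dummies is a hypothetical helper: one 0/1 column per factor level
make_dummies <- function(x) {
  lv <- levels(factor(x))
  out <- sapply(lv, function(l) as.integer(x == l))
  colnames(out) <- paste0("is_", lv)
  out
}

dummies <- make_dummies(data$grp)
data_all <- cbind(data["grp"], dummies)
```

Base R's model.matrix(~ grp - 1, data) builds a similar indicator matrix without a hand-written helper, which may be the simpler route if the dummies are only needed for modelling.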
