Tuesday, 3 August 2021

Dichotomous variable based on conditionals in R

I have a dataframe similar to this one:

date <- as.Date(c('2010-11-1','2010-11-2','2010-11-3','2010-11-4','2010-11-5','2010-11-6','2010-11-7','2010-11-8','2010-11-9','2010-11-10'))
precipitation <- c(0, 11, 12,3,0,0,0,7,9,10)
snowheight <- c(5,7,56,32, 11, 24, 70,8, 13, 11)
temperature <- c(-5, -2, 0, 0.4, -1, 5,6,4, 9, 10)
df <- data.frame(date, precipitation, snowheight, temperature) 

I am trying to create a dichotomous variable (0 or 1) for each row, based on the following conditions:

  • if snowheight > 10 we continue with the conditions below. Else assign NA to the dichotomous variable.
  • if precipitation <= 0 we assign 0
  • if precipitation > 0 and temperature > 0 we assign 1.
  • if precipitation > 0 and temperature < 0 we assign 0.

My first idea was ifelse with nested conditions, but I could not get it to work because the conditions partially overlap. The next thing that came to mind was a for-loop that checks every row. This is what I came up with:
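For comparison, here is a minimal sketch of how the rules listed above might be written as a nested ifelse, where ordering the checks takes care of the overlap. It rests on two assumptions that the rules do not settle: a temperature of exactly 0 is treated as 0, and the date column is left out for brevity.

```r
# Sample data from the question (date column omitted for brevity)
precipitation <- c(0, 11, 12, 3, 0, 0, 0, 7, 9, 10)
snowheight <- c(5, 7, 56, 32, 11, 24, 70, 8, 13, 11)
temperature <- c(-5, -2, 0, 0.4, -1, 5, 6, 4, 9, 10)
df <- data.frame(precipitation, snowheight, temperature)

# The outer test handles the snow height rule first, so the inner test
# only decides rows with snowheight > 10.
# Assumption: temperature == 0 is not covered by the rules; it gets 0 here.
df$dicht <- ifelse(
  df$snowheight <= 10,
  NA_real_,
  ifelse(df$precipitation > 0 & df$temperature > 0, 1, 0)
)

print(df)
```

Because the inner ifelse returns 1 only when both precipitation and temperature are positive, every other row with enough snow falls through to 0, which covers both "no precipitation" and "precipitation at or below freezing".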

for (i in df){
  if (snowheight > 10 && rain > 0){
    if (temperature > 0){
      df$dicht <- 1
    } else {
      df$dicht <- 0
    }
  } else {
    df$dicht <- NA
  }
}

When I run the code like this, I get a variable "dicht" filled entirely with NA. I think I see what kind of mistake this is, but I am not sure how to fix it: written this way, the value seems to be assigned to the whole "dicht" column instead of the indexed row. So I tried indexing the rows like this:

for (i in df){
  if (snowheight > 10 && rain > 0){
    if (temperature > 0){
      df$dicht [i] <- 1
    } else {
      df$dicht [i] <- 0
    }
  } else {
    df$dicht [i] <- NA
  }
}

However, I get the following error:

Error in $<-.data.frame(*tmp*, "dicht", value = c(NA, NA, NA, NA, : replacement has 14923 rows, data has 10

Any help is appreciated. Thanks in advance.
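A possible row-wise version of the loop, as a sketch only. Two things seem relevant to the attempts above: `for (i in df)` iterates over the columns of a data frame rather than its rows, and the bare names `snowheight`, `rain` and `temperature` refer to objects in the workspace, not to columns of `df`. The sketch assumes `rain` was meant to be `precipitation`, treats a temperature of exactly 0 as 0, and assigns 0 when there is no precipitation, as the listed rules state.

```r
# Sample data from the question (date column omitted for brevity)
precipitation <- c(0, 11, 12, 3, 0, 0, 0, 7, 9, 10)
snowheight <- c(5, 7, 56, 32, 11, 24, 70, 8, 13, 11)
temperature <- c(-5, -2, 0, 0.4, -1, 5, 6, 4, 9, 10)
df <- data.frame(precipitation, snowheight, temperature)

# Create the column up front so single rows can be overwritten later
df$dicht <- NA_real_

# seq_len(nrow(df)) yields the row indices 1, 2, ..., nrow(df)
for (i in seq_len(nrow(df))) {
  # Rows with snowheight <= 10 keep their NA
  if (df$snowheight[i] > 10) {
    # Each side of && is a single value because of the [i] index
    if (df$precipitation[i] > 0 && df$temperature[i] > 0) {
      df$dicht[i] <- 1
    } else {
      df$dicht[i] <- 0
    }
  }
}

print(df)
```

Indexing every column with `[i]` keeps each comparison to a single value, which is what `if` and `&&` expect. For larger data, a vectorized expression would usually be preferred over a loop.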
