Saturday, 29 June 2019

Problems with indices and ifelse statements when trying to replace a for loop

I would like to replace the following for loop: it takes ages to run, and there has to be a simpler way. The data frame yearly (a data.table) consists of several columns, each with 15,457 entries. Column Id contains alphanumeric identifiers, while all other columns contain numbers and NAs.

# Initialise the result column as numeric NA
yearly[, OA := NA_real_]
for (i in 2:length(yearly$WC03063)) {
  if(yearly$Id[i] == yearly$Id[i - 1]) {
    if(is.na(yearly$WC03063[i])) {
      yearly$OA[i] <-
      yearly$WC02201[i] - yearly$WC02201[i - 1] -
      yearly$WC02001[i] - yearly$WC02001[i - 1] -
      yearly$WC03101[i] - yearly$WC03101[i - 1] +
      yearly$WC03051[i] - yearly$WC03051[i - 1] -
      yearly$WC02999[i]
    } else {
      yearly$OA[i] <-
      yearly$WC02201[i] - yearly$WC02201[i - 1] -
      yearly$WC02001[i] - yearly$WC02001[i - 1] -
      yearly$WC03101[i] - yearly$WC03101[i - 1] +
      yearly$WC03051[i] - yearly$WC03051[i - 1] +
      yearly$WC03063[i] - yearly$WC03063[i - 1] -
      yearly$WC02999[i]    
    }
  }
}

I have already made several attempts to solve this, but I keep running into problems with indices and ifelse statements. My most recent attempt looks like this:

i <- c(1:length(yearly$WC03063))
yearly[, OA := ifelse(i == 1, NA,
                      ifelse(yearly$Id[i] == yearly$Id[i - 1],
                             ifelse(is.na(yearly$WC03063),
                                    yearly$OA[i] <-
                                    yearly$WC02201[i] - yearly$WC02201[i - 1] -
                                    yearly$WC02001[i] - yearly$WC02001[i - 1] -
                                    yearly$WC03101[i] - yearly$WC03101[i - 1] +
                                    yearly$WC03051[i] - yearly$WC03051[i - 1] -
                                    yearly$WC02999[i],
                                    yearly$OA[i] <-
                                    yearly$WC02201[i] - yearly$WC02201[i - 1] -
                                    yearly$WC02001[i] - yearly$WC02001[i - 1] -
                                    yearly$WC03101[i] - yearly$WC03101[i - 1] +
                                    yearly$WC03051[i] - yearly$WC03051[i - 1] +
                                    yearly$WC03063[i] - yearly$WC03063[i - 1] -
                                    yearly$WC02999[i]),
                       NA))]

R's warning messages (in this case) read as follows:

1: In `==.default`(yearly$Id[i], yearly$Id[i - 1]) : Longer object length is not a multiple of shorter object length
2: In is.na(e1) | is.na(e2) : Longer object length is not a multiple of shorter object length
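
For illustration, here is a minimal sketch of where the length mismatch probably comes from. With i running from 1, the expression i - 1 starts at 0, and R silently drops a zero index, so the shifted vector is one element shorter than the original. The vector x and the names prev and same_as_prev are made up for this example.

```r
# Made-up stand-in for yearly$Id
x <- c("a", "a", "b", "b")
i <- seq_along(x)             # 1 2 3 4

len_current  <- length(x[i])      # all four elements
len_previous <- length(x[i - 1])  # index 0 is dropped, so one element fewer

# Aligned comparison: pad the shifted vector with NA for the first row,
# so that both sides have the same length.
prev <- c(NA, x[-length(x)])
same_as_prev <- x == prev
```

Comparing x[i] with x[i - 1] directly would therefore compare vectors of length 4 and 3, which is what triggers the recycling warning.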

I already tried several variations of solutions but not a single one has worked so far.
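
One possible vectorized form of the loop, as a sketch only and not a tested solution for the real data: it uses base R and keeps the arithmetic exactly as written in the loop, without adding any parentheses. The toy values and the helper lag1 are hypothetical. With data.table, shift() grouped by Id could presumably play the same role as lag1.

```r
# Toy data with made-up values; only the column names come from the question.
yearly <- data.frame(
  Id      = c("A1", "A1", "A1", "B2", "B2"),
  WC02201 = c(10, 12, 15, 20, 26),
  WC02001 = c(1, 2, 3, 4, 5),
  WC03101 = c(2, 2, 3, 1, 1),
  WC03051 = c(0, 1, 1, 2, 4),
  WC03063 = c(5, NA, 7, 3, 6),
  WC02999 = c(1, 1, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: value of the previous row, NA for the first row.
lag1 <- function(x) c(NA, head(x, -1))

# TRUE where a row has the same Id as the row before it (never for row 1).
same_id <- c(FALSE, yearly$Id[-1] == yearly$Id[-nrow(yearly)])

oa <- with(yearly, {
  # Terms shared by both branches of the loop, in the same order.
  common <- WC02201 - lag1(WC02201) -
    WC02001 - lag1(WC02001) -
    WC03101 - lag1(WC03101) +
    WC03051 - lag1(WC03051)
  # The WC03063 term only enters when the current value is not NA.
  extra <- ifelse(is.na(WC03063), 0, WC03063 - lag1(WC03063))
  common + extra - WC02999
})

# Rows that start a new Id keep NA, as in the loop.
yearly$OA <- ifelse(same_id, oa, NA_real_)
```

As in the loop, a row whose previous WC03063 is NA while its current value is not still ends up as NA.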
