Friday, February 2, 2018

How to update each column based on the previous column using a for loop

I have data that consists of an id variable followed by multiple visit variables that track a person's score over time. I am trying to carry each score forward, replacing any subsequent zeros with that score. If a value is NA, I would like to leave it as NA (it represents no visit), and if a person gets a new score later, I would like the new score to carry forward from that point on.

I have included a tiny reproducible example, but my actual data is quite large, so manually updating is not an option. My current attempt uses a for loop to step through the visit columns for each person (row). However, I am getting this error:

Error in if ((!is.na(first) & first != 0) & (!is.na(second) & second == : argument is of length zero
In addition: Warning message:
In is.na(second) : is.na() applied to non-(list or vector) of type 'NULL'

This appears to happen because, in the environment pane (RStudio), first has a value of NA_real_ and second has a value of NULL (empty).
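A minimal sketch of what seems to be going on, assuming the NULL comes from the inner loop reaching the last column so that j + 1 points one column past the end of the data frame. The two-row data frame here is a shortened, illustrative stand-in for the real data:

```r
# Shortened stand-in for the example data (illustrative only)
dat <- data.frame(id = c(101, 102), visit.1 = c(0, 21), visit.2 = c(0, 0))

# Indexing a single cell one column past the end returns NULL rather than an error
second <- dat[1, ncol(dat) + 1]
is.null(second)

# Comparing NULL with 0 gives a zero-length logical vector
cond <- second == 0
length(cond)

# if() needs exactly one TRUE/FALSE value, so a zero-length condition fails
failed <- tryCatch({
  if (cond) "zero" else "non-zero"
  FALSE
}, error = function(e) TRUE)
failed
```

If that assumption holds, the loop only fails once j reaches ncol(dat), which would explain why first still holds the last real value in the row (NA_real_ for the first person) while second is NULL.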

How do I properly define these? I don't have much experience with for loops, so all advice is welcome!

id <- c(101, 102, 103, 104)
visit.1 <- c(0, 21, 0, 21)
visit.2 <- c(0, 0, 50, 0)
visit.3 <- c(0, 0, 0, 44)
visit.4 <- c(NA, NA, 0, 0)
dat <- data.frame(id, visit.1, visit.2, visit.3, visit.4)


for(i in 1:nrow(dat)){
  for(j in 2:ncol(dat)){

    first <- dat[i, j]
    second <- dat[i, (j+1)]

    if((!is.na(first) & first != 0) & (!is.na(second) & second == 0)){
      second <- first
    } else {
      second <- second
    }
  }
}

The original dataset:

   id visit.1 visit.2 visit.3 visit.4
1 101       0       0       0      NA
2 102      21       0       0      NA
3 103       0      50       0       0
4 104      21       0      44       0

The desired end result:

   id visit.1 visit.2 visit.3 visit.4
1 101       0       0       0      NA
2 102      21      21      21      NA
3 103       0      50      50      50
4 104      21      21      44      44
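One way the loop might be adjusted to produce the desired table, offered as a sketch rather than a definitive fix. It makes two changes to the original attempt: the inner loop stops at the last visit column instead of reading one column past it, and the new value is written back into the data frame instead of into the temporary variable second. The helper name carry_forward and its cols argument are hypothetical, and the sketch assumes a score should not be carried across an NA, which the question does not specify.

```r
id <- c(101, 102, 103, 104)
visit.1 <- c(0, 21, 0, 21)
visit.2 <- c(0, 0, 50, 0)
visit.3 <- c(0, 0, 0, 44)
visit.4 <- c(NA, NA, 0, 0)
dat <- data.frame(id, visit.1, visit.2, visit.3, visit.4)

# Hypothetical helper: carry the previous non-zero score into a following zero.
# cols is the ordered vector of visit column names.
carry_forward <- function(df, cols) {
  for (i in seq_len(nrow(df))) {
    # Start at the second visit column so there is always a previous column
    for (k in seq_along(cols)[-1]) {
      prev <- df[i, cols[k - 1]]
      curr <- df[i, cols[k]]
      # && short-circuits, so the comparisons are skipped when a value is NA
      if (!is.na(prev) && prev != 0 && !is.na(curr) && curr == 0) {
        df[i, cols[k]] <- prev  # write back into the data frame itself
      }
    }
  }
  df
}

visit_cols <- grep("^visit\\.", names(dat), value = TRUE)
result <- carry_forward(dat, visit_cols)
print(result)
```

Because each update is written into df before the next column is examined, a carried score keeps propagating through later zeros until a new non-zero score or an NA is reached. On large data a row-by-row loop may be slow, so a vectorized approach could be worth considering.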
