Wednesday, May 15, 2019

Replace NAs based on values surrounding them

Let's say I have a vector that is all NA except for every 5th value, which can be one of two levels:

RNGkind('Mersenne-Twister')
set.seed(42)

x <- NULL
for(i in 1:1000){
  x <- c(x, c(sample(c('Hey', 'Hullo'), 1, replace = FALSE), rep(NA, 4)))
}
x

I want to fill the NAs based on what is surrounding them:

"Hullo" NA NA NA NA "Hey": NAs become "Hey"
"Hullo" NA NA NA NA "Hullo": NAs become "Hullo"
"Hey" NA NA NA NA "Hullo": NAs become "Hullo"
"Hey" NA NA NA NA "Hey": NAs become "Hey"
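
The four cases above seem to share one pattern: the fill value always matches the value to the right of the gap. As a quick sanity check, the cases can be put in a small table (the `cases` data frame and its column names are made up for illustration):

```r
# All four (left, right) combinations from the list above,
# with the fill value each one is supposed to produce.
cases <- data.frame(
  left     = c("Hullo", "Hullo", "Hey",   "Hey"),
  right    = c("Hey",   "Hullo", "Hullo", "Hey"),
  expected = c("Hey",   "Hullo", "Hullo", "Hey"),
  stringsAsFactors = FALSE
)

# TRUE if the expected fill always equals the right-hand neighbour.
all(cases$expected == cases$right)
```

If that holds, the left-hand value never matters, and the task reduces to "carry the next non-NA value backward".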

I've come up with a for loop that looks at each element iteratively and fills the NAs based on a lot of if statements:

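for(i in seq_along(x)){
  if(!is.na(x[i])){
    next  # already a value, nothing to fill
  }else{
    # x[i] is the first NA of a run: x[i-1] is the value before it, x[i+4] the value after it.
    # In the last run x[i+4] is past the end of the vector (NA), so this condition is NA and if() fails.
    if(x[i-1] == 'Hullo' && x[i+4] == 'Hullo' || x[i-1] == 'Hey' && x[i+4] == 'Hullo'){
      x[i:(i+3)] <- 'Hullo'
    }else{
      x[i:(i+3)] <- 'Hey'
    }
  }
}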

But it's a bit of a hacky way of doing it. Plus, it doesn't deal with the tail end of the vector, where there could be NAs with no value after them.

Is there:

  1. a more elegant/faster way to do this?
  2. a way to fill up the end of the vector without having to do it manually?
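
One possible direction for both points, as a minimal base R sketch rather than a definitive answer: fill every NA with the next non-NA value in one vectorized step. The helper name `fill_from_next` is hypothetical, and the tail behaviour is an assumption, since the rules above do not say what trailing NAs should become; here they take the last observed value.

```r
# Hypothetical helper: replace each NA with the next non-NA value.
# Assumption: trailing NAs (no value after them) take the last observed value.
fill_from_next <- function(x) {
  obs <- which(!is.na(x))             # positions of the observed values
  if (length(obs) == 0L) return(x)    # nothing to fill from
  # findInterval() counts the observed positions strictly before each index;
  # adding 1 gives the rank of the first observed position at or after it.
  nxt <- findInterval(seq_along(x) - 1L, obs) + 1L
  nxt <- pmin(nxt, length(obs))       # trailing NAs fall back to the last observed value
  x[obs[nxt]]
}

# Small example with a trailing NA.
demo <- c("Hullo", NA, NA, "Hey", NA)
fill_from_next(demo)
```

This avoids the explicit loop and does not depend on the gaps being exactly four long. If an extra dependency is acceptable, `na.locf(x, fromLast = TRUE)` from the zoo package is a packaged version of the same backward fill, though its handling of trailing NAs differs from the assumption made here.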
