Tuesday, 19 February 2019

IF condition based on ROLLMEAN and ROLLAPPLY

My real dataset is an xts object with 4 columns x 110000 rows of signal output values. I would like to remove some of those values based on a somewhat arbitrary criterion.
Taking the sample_matrix dataset from xts as an example, my code looks like this:

require(xts)
require(zoo)
data("sample_matrix")
myxts <- as.xts(sample_matrix)

for (colonne in 1:ncol(myxts)) {
  for (i in 2:nrow(myxts)) {
    # j = number of points taken before i, k = number taken after i (at most 10 each)
    if (i < 11) {
      j <- i - 1
      k <- 10
    } else if (i > nrow(myxts) - 10) {
      j <- 10
      k <- nrow(myxts) - i
    } else {
      j <- 10
      k <- 10
    }
    # Parentheses are needed here: `:` binds more tightly than `+` and `-`
    win <- myxts[(i - j):(i + k), colonne]
    if (myxts[i, colonne] > mean(win) + 5 * sd(win)) {
      myxts[i, colonne] <- NA
      myxts <- na.approx(myxts)
    }
  }
}

What I'm doing is removing any value that is greater than the mean plus 5 standard deviations of the 20 adjacent values. This code runs, but it is slow and most likely not optimised.
The two if statements are there to avoid calculating mean and sd with a subscript out of bounds.
I want to shorten the code using rollmean and rollapply, but I have no idea how to do it.

So far this is what I think it should look like:

for (i in 1:nrow(myxts)) {
if (myxts[i,] > rollmean(myxts[i,],k=20)+5*rollapply(myxts[i,],width = 20,FUN =sd)) {
  myxts[i,] <- NA
  myxts<- na.approx(myxts)
}}

But this leads to Error in rollapply.xts(x, k, FUN = (mean), fill = fill, align = align, : width <= nr is not TRUE
I don't know how to make the rollmean "follow" i.

Any help is welcome!
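
One possible direction, offered as a sketch rather than a definitive answer: the error above appears because myxts[i,] is a single row, so a width of 20 is larger than the number of rows passed to rollapply. The rolling statistics can instead be computed once over the whole series, which removes the need for the loop over i. The helper name remove_spikes and its parameters half and n_sd are made up for illustration. The sketch assumes a centred window of 10 points before, the point itself, and 10 points after (width 21), and it works on the plain matrix from coredata() so that zoo's default rollapply method, which accepts partial = TRUE, handles the shortened windows at both ends.

```r
library(xts)  # also attaches zoo (rollapply, na.approx)

# Hypothetical helper, not from the original post.
# half: number of neighbours on each side (10 + the point + 10 = width 21)
# n_sd: threshold in standard deviations above the rolling mean
remove_spikes <- function(x, half = 10, n_sd = 5) {
  vals  <- coredata(x)          # plain numeric matrix, one column per signal
  width <- 2 * half + 1
  # partial = TRUE shortens the window near both ends instead of failing,
  # which replaces the manual j / k bookkeeping
  m <- rollapply(vals, width, mean, partial = TRUE, align = "center")
  s <- rollapply(vals, width, sd,   partial = TRUE, align = "center")
  vals[vals > m + n_sd * s] <- NA
  # Rebuild the xts object and interpolate all gaps in one pass;
  # na.rm = FALSE keeps any leading or trailing NA rows instead of dropping them
  na.approx(xts(vals, order.by = index(x)), na.rm = FALSE)
}

data("sample_matrix")
myxts <- as.xts(sample_matrix)

# Illustration only: inject an artificial spike, then clean it
spiked <- myxts
spiked[50, 1] <- as.numeric(spiked[50, 1]) + 100
cleaned <- remove_spikes(spiked, n_sd = 3)
```

Two caveats on this sketch. First, it computes every mean and sd from the raw data in a single pass, whereas the loop interpolates as it goes, so the two versions are not guaranteed to flag exactly the same points. Second, the example uses n_sd = 3 on purpose: when the window contains the tested point itself, a value cannot lie more than (n - 1)/sqrt(n) sample standard deviations above the window mean, which is about 4.36 for n = 21, so a threshold of 5 would never trigger with this window. Excluding the centre point from the window, or lowering the threshold, would be ways around that.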
