Monday, November 2, 2015

`for` loop coercing matrix into large list in R

I have a large dataset (4352 observations) that I am trying to split into continuous and discrete data in preparation for a Bayesian analysis. So far, I have tried two different ways of doing this, both inside for loops: an if-then statement and an if-else.

I have my observations as proportions in the object y:

> head(y,10)
      A    B    C DEF
1  0.50 0.50 0.00 0.0
2  0.95 0.00 0.05 0.0
3  0.10 0.00 0.00 0.9
4  0.70 0.00 0.30 0.0
5  0.95 0.00 0.05 0.0
6  0.60 0.00 0.40 0.0
7  0.95 0.00 0.05 0.0
8  0.95 0.05 0.00 0.0
9  1.00 0.00 0.00 0.0
10 1.00 0.00 0.00 0.0

I also have a vector with one entry per row of y, which I will later use to index whether a row is discrete (0,1) or continuous:

y.discrete <- rep(0,dim(y)[1])

My first method is the if-then statement:

y.d <- matrix(NA,n,ncat)

for (i in 1:n){
  y.d[i,][max(y[i,])==1]=y[i,]
  y.discrete[i][!is.na(y.d[i,])]=1
}

The for loop produces the error `Error in y.d[i, 1] : incorrect number of dimensions`. If I refer to a single element (e.g., y.d[i,1]) in the if-then statement, it runs without error. Also, once the loop has run, the object y.d has changed from a matrix to a Large list. I believe this is what causes the dimension error. When the loop stops, the value of i is 1.
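
A minimal sketch of one way a matrix can turn into a list like this, assuming y is a data frame rather than a numeric matrix (the plain row labels in the printout suggest so, but that is an assumption). The toy data and the names y.toy and m are made up for illustration:

```r
# Toy stand-in for y; assumed to be a data frame (hypothetical data)
y.toy <- data.frame(A = c(0.5, 1), B = c(0.5, 0), C = c(0, 0), DEF = c(0, 0))

m <- matrix(NA, nrow(y.toy), ncol(y.toy))
is.list(m)            # FALSE: a plain logical matrix

# A single row of a data frame is itself a one-row data frame,
# which is a list underneath
is.list(y.toy[2, ])   # TRUE

# Assigning that list into a matrix row coerces the whole object to a list
m[2, ] <- y.toy[2, ]
is.list(m)            # TRUE
```

If the real y is already a numeric matrix, this sketch does not apply and the cause lies elsewhere.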

I have also tried an if-else:

y.d <- matrix(NA,n,4)

for (i in 1:n){
  if (max(y[i,])==1) {
    y.d[i,]<-y[i,]    
  } else {
    if (!is.na(y.d[i,1])) {
      y.discrete[i]<-1
    } 
  }
}

This produces the same error, but this time the last value of i is 10. It also still changes the class of y.d.
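
One thing that may be worth trying, on the assumption that y is a data frame: convert it to a numeric matrix before the loop, so that each row is a plain numeric vector. The sketch below uses made-up data (y.toy) and a hypothetical name y.mat. It also sets the flag in the same branch that copies the row, which is one reading of the intent and not necessarily the intended logic:

```r
# Hypothetical toy data standing in for y
y.toy <- data.frame(A   = c(0.5, 1, 0.95),
                    B   = c(0.5, 0, 0),
                    C   = c(0, 0, 0.05),
                    DEF = c(0, 0, 0))

y.mat <- as.matrix(y.toy)   # numeric matrix; y.mat[i, ] is a numeric vector
n <- nrow(y.mat)

y.d <- matrix(NA_real_, n, ncol(y.mat))
y.discrete <- rep(0, n)

for (i in seq_len(n)) {
  # Exact comparison; assumes a proportion of 1 is stored exactly as 1
  if (max(y.mat[i, ]) == 1) {
    y.d[i, ] <- y.mat[i, ]   # copy the row; y.d stays a numeric matrix
    y.discrete[i] <- 1       # flag the row as discrete
  }
}

is.matrix(y.d) && !is.list(y.d)   # TRUE
y.discrete                        # 0 1 0
```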

Does anyone have any thoughts on what is happening here? I have already asked two colleagues for help, and we are all stumped. I appreciate your help. I am running R 3.0.3 on a 64-bit Windows 7 machine.
