Saturday, 18 June 2016

Looping through list of data frames in SparkR

I have a list of data frames (x1, x2, x3, ...) in SparkR, and I need to loop over them to replace the string value "UNKNOWN" with the missing value NA. Also, each value in my column "OCUPACAO" mixes text and digits, as in the following example:

OCUPACAO<-c("UNKNOWN","YYY 23443", "YYY 98877447", "YYY 77988907")

When I apply the loop:

x <- list(x1, x2, x3)
for (j in 1:length(x)) {
  x[[j]]$OCUPACAO <- ifelse(x[[j]]$OCUPACAO == "UNKNOWN", NA, x[[j]]$OCUPACAO)
  x[[j]]$OCUPACAO <- substr(x[[j]]$OCUPACAO, 5, 11)
}

No error is generated, but nothing happens. The "ifelse" and "substr" statements have no effect, and the data frame remains the same:

OCUPACAO<-c("UNKNOWN","YYY 23443", "YYY 98877447", "YYY 77988907")
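One possible explanation, offered as a guess rather than a diagnosis: the loop modifies the copies stored in the list x, so if the result is checked on x1, x2 or x3, those objects will look untouched. The sketch below illustrates this with plain base-R data frames standing in for the SparkR ones (an assumption made so it runs without a Spark session; the behaviour of ifelse and substr on SparkR columns may differ). The two small data frames are made-up illustration data.

```r
# Base-R data frames stand in for the SparkR DataFrames (assumption:
# chosen only so the sketch runs without a Spark session).
x1 <- data.frame(OCUPACAO = c("UNKNOWN", "YYY 23443"), stringsAsFactors = FALSE)
x2 <- data.frame(OCUPACAO = c("YYY 98877447", "UNKNOWN"), stringsAsFactors = FALSE)

# list() stores copies: x[[1]] and x1 are separate objects from here on.
x <- list(x1, x2)

for (j in seq_along(x)) {
  col <- x[[j]]$OCUPACAO
  col <- ifelse(col == "UNKNOWN", NA, col)  # "UNKNOWN" becomes NA
  x[[j]]$OCUPACAO <- substr(col, 5, 11)     # keep characters 5 to 11
}

print(x[[1]]$OCUPACAO)  # the copy inside the list has changed
print(x1$OCUPACAO)      # the original object is still unchanged
```

If this is what is happening, reading the results from x[[1]], x[[2]], ... (or assigning them back to x1, x2, ...) should show the cleaned column.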
