I am currently facing the following problem. While analyzing a large dataset (roughly 3 million observations), I needed to convert a variable from one format to another. Specifically, I have the date of incorporation of several firms, but it comes in two formats, YYYY or MM-DD-YYYY, plus other variants in which the last 4 characters always give the year. All I need is the year, so I wrote this code:
library(stringi)
for (i in seq_along(amadeus$Dateofincorporation)) {
  if (nchar(amadeus$Dateofincorporation[i]) == 4 & !is.na(amadeus$Dateofincorporation[i])) {
    # Already a 4-character year: keep it as is
    amadeus$Dateofincorporation[i] <- amadeus$Dateofincorporation[i]
  } else if (nchar(amadeus$Dateofincorporation[i]) != 4 & !is.na(amadeus$Dateofincorporation[i])) {
    # Longer format: keep only the last 4 characters (the year)
    amadeus$Dateofincorporation[i] <- stri_sub(amadeus$Dateofincorporation[i], -4, -1)
  } else {
    # Missing value: leave it unchanged
    amadeus$Dateofincorporation[i] <- amadeus$Dateofincorporation[i]
  }
}
The code executes for a long time, and then returns as output:
Warning messages:
1: In doTryCatch(return(expr), name, parentenv, handler) : display list redraw incomplete
2: In doTryCatch(return(expr), name, parentenv, handler) : invalid graphics state
3: In doTryCatch(return(expr), name, parentenv, handler) : invalid graphics state
4: In doTryCatch(return(expr), name, parentenv, handler) : display list redraw incomplete
5: In doTryCatch(return(expr), name, parentenv, handler) : invalid graphics state
6: In doTryCatch(return(expr), name, parentenv, handler) : invalid graphics state
Does anyone have any ideas on how to deal with this? Thank you very much. P.S. The vector is currently a character vector; do you think this has an impact?
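For reference, here is a minimal sketch of a vectorized alternative, assuming that the year really is always the last 4 characters and that the column is a character vector. The `dates` vector below is a made-up sample standing in for `amadeus$Dateofincorporation`; it is not real data from the dataset. The sketch uses base R only, but `stri_sub()` from stringi is also vectorized, so it could be applied to the whole column in one call instead of element by element.

```r
# Hypothetical sample standing in for amadeus$Dateofincorporation
dates <- c("1998", "03-15-2004", NA, "15/03/2011")

# Take the last 4 characters of every element in one vectorized call.
# For a 4-character value this returns the value itself, and
# substr() returns NA for missing input, so no if/else is needed.
n <- nchar(dates)
years <- substr(dates, n - 3, n)

# stringi equivalent (assumes the stringi package is installed):
# years <- stringi::stri_sub(dates, -4, -1)

print(years)
```

On the real data this would presumably become a single assignment back into the column, which avoids modifying the data frame once per row in a loop of 3 million iterations.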