I'm trying to create a new column called "nintydayinterval" in my data set z. Another column, z$Date, holds a numeric date (in Excel's serial format). I want the new variable to be based on intervals of approximately 90 days.
If z$Date < 42460, then z$nintydayinterval should be 0; if z$Date < 42550, then it should be 1; and so on, up to 9.
I've tried several methods and nothing works. My current version never completes. Any idea how to get this to run smoothly? I was using the lubridate package, but it won't work with a data set this large.
Note: I'm running 17 million rows of data through this, so efficiency is important. I've written similar if statements before without problems, but I can't figure this one out.
z$nintydayinterval <- NA
x <- z$nintydayinterval
y <- z$Date
n <- 17007029
for (i in 1:n) {
  if (y[i] < 42460) {
    x[y] <- 0
  } else if (y[i] < 42550) {
    x[y] <- 1
  } else if (y[i] < 42640) {
    x[y] <- 2
  } else if (y[i] < 42730) {
    x[y] <- 3
  } else if (y[i] < 42820) {
    x[y] <- 4
  } else if (y[i] < 42910) {
    x[y] <- 5
  } else if (y[i] < 43000) {
    x[y] <- 6
  } else if (y[i] < 43090) {
    x[y] <- 7
  } else if (y[i] < 43180) {
    x[y] <- 8
  } else {
    x[y] <- 9
  }
}
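One likely reason the loop never finishes: each assignment is written as `x[y]`, which indexes by the entire vector `y` on every iteration, where `x[i]` was presumably intended. The result `x` is also never copied back into `z$nintydayinterval`. The whole if/else chain can probably be replaced by a single vectorized call. The following is only a sketch, run on a small made-up stand-in for `z` (the toy values are assumptions, not the real data), using base R's `findInterval()` with the cut-offs taken from the loop:

```r
# Toy stand-in for z; the real data frame has about 17 million rows.
z <- data.frame(Date = c("42961", "42961", "42735", "42459", "42460", "43180"),
                stringsAsFactors = FALSE)

# Upper bounds from the if/else chain: 42460, 42550, ..., 43180 (9 values).
breaks <- seq(42460, 43180, by = 90)

# findInterval() counts how many breaks are <= each value, so values below
# 42460 map to 0 and values at or above 43180 map to 9.
# as.numeric() is needed because Date is stored as character.
z$nintydayinterval <- findInterval(as.numeric(z$Date), breaks)

print(z$nintydayinterval)
# 6 6 4 0 1 9
```

Because `findInterval()` handles the whole column in one call, no explicit loop over rows is needed.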
> str(z)
'data.frame': 17007029 obs. of 4 variables:
$ Search : Factor w/ 109505 levels "5c4feef",..: 1 1 1 1 1 1 1 1 ...
$ Event : Factor w/ 85 levels "Arcnet","Boot",..: 2 22 6 6 6 6 22 22
$ Date : chr "42961" "42961" "42735" "42735" ...
$ nintydayinterval: logi NA NA NA NA NA NA ...
>
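The `str(z)` output shows that `Date` is stored as character rather than numeric. When R compares a character value with a number, it coerces the number to character and compares the two as strings, so it is probably safer to convert the column once up front. A minimal sketch, using the first few values visible in the output above and assuming the serial numbers follow the Windows Excel 1900 date system (origin `1899-12-30` in R):

```r
# Values copied from the str(z) output; Date is stored as character there.
date_chr <- c("42961", "42961", "42735")

# Convert once so that later comparisons are numeric, not string-based.
date_num <- as.numeric(date_chr)

# Assumption: Windows Excel 1900 date system, i.e. origin 1899-12-30 in R.
date_cal <- as.Date(date_num, origin = "1899-12-30")

print(date_cal)
# "2017-08-14" "2017-08-14" "2016-12-31"
```

Both `as.numeric()` and `as.Date()` are vectorized base R functions, so this does not require lubridate or a loop over rows.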