I have written a set of if statements inside a FOR loop, but the loop takes more than 10 minutes to run. After reading an article describing how to use IFELSE in place of a FOR loop, I have been trying to speed it up.
The head of the data set looks like this:
  Destination.City.Name Booking.ID Creation.Date Cancellation.Date Arrival.Date Status.Name Nights Room.nights DI.flag Star.rating
1             Abu Dhabi   14418661    2015-02-16        2015-02-16   2015-04-15   Cancelled     90          90       N           4
2             Abu Dhabi   14418661    2015-02-16        2015-02-16   2015-04-14   Cancelled     90          90       N           4
3             Abu Dhabi   14418661    2015-02-16        2015-02-16   2015-04-06   Cancelled     90          90       N           4
4             Abu Dhabi   14418661    2015-02-16        2015-02-16   2015-04-02   Cancelled     90          90       N           4
5             Abu Dhabi   14418661    2015-02-16        2015-02-16   2015-03-29   Cancelled     90          90       N           4
6             Abu Dhabi    9634541    2013-06-11        2013-06-13   2013-09-13   Cancelled     90          90       N           5
  Future.Arrival.Flag Future.Creation.Flag Future.Arrival.Day Status.On.Model.Date
1                   1                    1                469                   NA
2                   1                    1                468                   NA
3                   1                    1                460                   NA
4                   1                    1                456                   NA
5                   1                    1                452                   NA
6                  NA                   NA                 NA                   NA
The FOR loop essentially populates the last column Status.On.Model.Date based on the simple logic:
If the Creation Date is after the Model Date, it's NA.
If the Cancellation Date is NA, it is confirmed.
If the Cancellation Date is >= Model Date, it is confirmed, else it is Cancelled.
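To make these rules concrete, here is a minimal sketch on made-up data. The `toy` data frame, its dates, and the `Model.Date` value are assumptions for illustration only and are not taken from the real data set. It applies the three rules with logical indexing instead of a loop.

```r
# Hypothetical model date and toy bookings (not from the real data set)
Model.Date <- as.Date("2015-01-01")
toy <- data.frame(
  Creation.Date     = as.Date(c("2015-02-16", "2013-06-11", "2014-05-01", "2014-06-01")),
  Cancellation.Date = as.Date(c("2015-02-16", "2013-06-13", NA, "2015-03-01"))
)

# Default outcome: cancelled before the model date
status <- rep("Cancelled", nrow(toy))

# Confirmed if never cancelled, or cancelled on/after the model date
confirmed <- is.na(toy$Cancellation.Date) | toy$Cancellation.Date >= Model.Date
status[confirmed] <- "Confirmed"

# Bookings created after the model date do not exist yet, so they get NA
status[toy$Creation.Date > Model.Date] <- NA

toy$Status.On.Model.Date <- status
print(toy)
```

Each rule touches the whole column at once, which is the property that should make a vectorised version much faster than the row-by-row loop.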
The original FOR loop is below. As mentioned, it works when executed but takes more than 10 minutes (the data set has 600K+ rows):
for (i in seq_along(bookingdata$Status.On.Model.Date)) {
  if (bookingdata$Creation.Date[i] > Model.Date) {
    bookingdata$Status.On.Model.Date[i] = NA
  } else {
    if (is.na(bookingdata$Cancellation.Date[i])) {
      bookingdata$Status.On.Model.Date[i] = 'Confirmed'
    } else {
      if (bookingdata$Cancellation.Date[i] >= Model.Date) {
        bookingdata$Status.On.Model.Date[i] = 'Confirmed'
      } else {
        if (bookingdata$Cancellation.Date[i] < Model.Date) {
          bookingdata$Status.On.Model.Date[i] = 'Cancelled'
        }
      }
    }
  }
}
The new IFELSE code I wrote in place of this is below:
bookingdata$Status.On.Model.Date = ifelse(bookingdata$Creation.Date > Model.Date, NA,
ifelse(is.na(bookingdata$Cancellation.Date, 'Confirmed',
ifelse(bookingdata$Cancellation.Date >= Model.Date, 'Confirmed', 'Cancelled'))))
but it fails with the following error:
Error in is.na(bookingdata$Cancellation.Date, "Confirmed", ifelse(bookingdata$Cancellation.Date >= :
3 arguments passed to 'is.na' which requires 1
I'm not sure how to correct the error as I don't know how else the statements can be realigned.
Thanks!
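One possible reading of the error message: the closing parenthesis of `is.na()` is missing, so `'Confirmed'` and the innermost `ifelse()` end up as extra arguments to `is.na()` rather than to the second `ifelse()`. Below is a minimal sketch of the call with that parenthesis moved, run on made-up data. The three-row `bookingdata` and the `Model.Date` value here are assumptions for illustration, not the real data.

```r
# Hypothetical model date and a tiny stand-in for bookingdata
Model.Date <- as.Date("2015-01-01")
bookingdata <- data.frame(
  Creation.Date     = as.Date(c("2015-02-16", "2013-06-11", "2014-05-01")),
  Cancellation.Date = as.Date(c("2015-02-16", "2013-06-13", NA))
)

# is.na() now closes right after its single argument, so each ifelse()
# receives exactly three arguments: test, yes, no
bookingdata$Status.On.Model.Date <- ifelse(
  bookingdata$Creation.Date > Model.Date, NA,
  ifelse(is.na(bookingdata$Cancellation.Date), "Confirmed",
    ifelse(bookingdata$Cancellation.Date >= Model.Date, "Confirmed", "Cancelled")))

print(bookingdata$Status.On.Model.Date)
```

The innermost `ifelse()` returns NA wherever the cancellation date is missing, but the middle `ifelse()` has already chosen "Confirmed" for those rows, so the NA never reaches the result.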