I'm trying to write R code to compute the Volume Weighted Average Price (VWAP) across different levels (depths) of an order book. I want to go up to level 5, but without hard-coding the level (depth) of the book. My data set has about 500,000 rows and 62 variables. I have written code that does what I want, but it relies on if statements. The code is as follows:
BVWAP = function(file, level = 5){
whole_data<- read.csv(file = file,header = FALSE,sep = "",col.names = c("DateTime","Seq","BP1","BQ1","BO1","AP1","AQ1","AO1","BP2","BQ2","BO2","AP2","AQ2","AO2","BP3","BQ3","BO3","AP3","AQ3","AO3","BP4","BQ4","BO4","AP4","AQ4","AO4","BP5","BQ5","BO5","AP5","AQ5","AO5","BP6","BQ6","BO6","AP6","AQ6","AO6","BP7","BQ7","BO7","AP7","AQ7","AO7","BP8","BQ8","BO8","AP8","AQ8","AO8","BP9","BQ9","BO9","AP9","AQ9","AO9","BP10","BQ10","BO10","AP10","AQ10","AO10"))
whole_data<- whole_data[which(whole_data$DateTime != 0),]
whole_data$DateTime= as.POSIXct(whole_data$DateTime/(10^9), origin="1970-01-01") #timestamp conversion
completecase<- whole_data[complete.cases(whole_data),]
attach(completecase)
if(level == 5){
B = data.frame(DateTime=completecase$DateTime, WAP = ((BP1*BQ1)+(BP2*BQ2)+(BP3*BQ3)+(BP4*BQ4)+(BP5*BQ5))/(BQ1+BQ2+BQ3+BQ4+BQ5))
}
if(level == 4){
B = data.frame(DateTime=completecase$DateTime, WAP = ((BP1*BQ1)+(BP2*BQ2)+(BP3*BQ3)+(BP4*BQ4))/(BQ1+BQ2+BQ3+BQ4))
}
if(level == 3){
B = data.frame(DateTime=completecase$DateTime, WAP = ((BP1*BQ1)+(BP2*BQ2)+(BP3*BQ3))/(BQ1+BQ2+BQ3)) #level 3 uses only the first three levels
}
if(level == 2){
B = data.frame(DateTime=completecase$DateTime, WAP = ((BP1*BQ1)+(BP2*BQ2))/(BQ1+BQ2))
}
B
}
Now, I know that multiple if statements slow it down pretty significantly, and that is exactly what I need help with. How do I write this code using a for loop or something along those lines? How do I loop across the columns? What would be a more efficient/faster way to get where I want? Any kind of help will be greatly appreciated.
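For illustration, here is a minimal sketch of one way the per-level branches might be avoided: build the column names from the level and let rowSums() do the arithmetic. The helper name bid_vwap and the toy data frame are hypothetical. The sketch assumes the BP1..BPn / BQ1..BQn column naming used in BVWAP above, so treat it as a starting point rather than a definitive implementation.

```r
# Hypothetical helper: bid-side VWAP over the first `level` book levels.
# Assumes `book` has columns BP1..BPn (prices) and BQ1..BQn (quantities),
# named as in the col.names passed to read.csv in BVWAP.
bid_vwap <- function(book, level = 5) {
  price_cols <- paste0("BP", seq_len(level))  # "BP1", ..., "BP<level>"
  qty_cols   <- paste0("BQ", seq_len(level))  # "BQ1", ..., "BQ<level>"
  prices <- as.matrix(book[, price_cols, drop = FALSE])
  qtys   <- as.matrix(book[, qty_cols, drop = FALSE])
  # Element-wise product, then row sums: one VWAP per order-book snapshot.
  rowSums(prices * qtys) / rowSums(qtys)
}

# Made-up book with two snapshots and three levels.
toy <- data.frame(
  BP1 = c(100, 101), BQ1 = c(10, 5),
  BP2 = c(99, 100),  BQ2 = c(30, 5),
  BP3 = c(98, 99),   BQ3 = c(60, 10)
)
vwap2 <- bid_vwap(toy, level = 2)  # 99.25 100.50
vwap3 <- bid_vwap(toy, level = 3)  # 98.50  99.75
```

Because the columns are selected by name, the same function works for any level the book contains, and it avoids attach() altogether.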
Also, since I am working with pretty large data sets, what would be the best way to read the file using as little RAM as possible? Running this code a few times slows down my system quite significantly. Any suggestions on which function I should use to reduce RAM usage?
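As a rough sketch of one memory-saving option, read.table() can skip unneeded columns when their colClasses entry is "NULL", so only DateTime and the bid price/quantity columns for the requested levels are ever loaded. The function name read_bid_levels and the temporary demo file are hypothetical. The column layout is assumed to be the 62-column, whitespace-separated one from BVWAP above.

```r
# Column names in file order: DateTime, Seq, then BP/BQ/BO/AP/AQ/AO for levels 1..10.
all_cols <- c("DateTime", "Seq",
              as.vector(outer(c("BP", "BQ", "BO", "AP", "AQ", "AO"), 1:10, paste0)))

# Hypothetical reader: load only DateTime plus bid prices/quantities up to `level`.
read_bid_levels <- function(file, level = 5) {
  keep <- c("DateTime", paste0("BP", seq_len(level)), paste0("BQ", seq_len(level)))
  # "NULL" makes read.table skip a column; "numeric" avoids type guessing.
  classes <- ifelse(all_cols %in% keep, "numeric", "NULL")
  read.table(file, header = FALSE, sep = "",
             col.names = all_cols, colClasses = classes)
}

# Demo on a made-up two-row file with the assumed 62-column layout.
tmp <- tempfile()
rows <- rbind(c(1.5e18, 1, seq_len(60)),
              c(1.5e18 + 1e9, 2, seq_len(60) + 1))
write.table(rows, tmp, row.names = FALSE, col.names = FALSE)
small <- read_bid_levels(tmp, level = 2)
```

Here small has just five columns (DateTime, BP1, BQ1, BP2, BQ2) instead of 62. If the data.table package is an option, its fread() function with the select argument is another commonly used way to read only some columns.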
Let me know if any other information is needed.
The formula for VWAP up to level n is as follows:
(Bid1*Volume1 + Bid2*Volume2 + ... + Bidn*Volumen) / (Volume1 + Volume2 + ... + Volumen)
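To make the formula concrete, here is a small worked example for a single snapshot with made-up prices and volumes (the variable names are illustrative only). It transcribes the formula directly and cross-checks it against base R's weighted.mean().

```r
# Made-up bid prices and volumes for three levels of one snapshot.
bid_price <- c(50, 49, 48)
bid_qty   <- c(20, 30, 50)

# Direct transcription of the formula: sum(price * volume) / sum(volume).
# (50*20 + 49*30 + 48*50) / (20 + 30 + 50) = 4870 / 100
vwap <- sum(bid_price * bid_qty) / sum(bid_qty)  # 48.7

# weighted.mean() computes the same volume-weighted average.
vwap_check <- weighted.mean(bid_price, bid_qty)
```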