Monday, May 13, 2019

How do we build a loop or if statement that checks several conditions at once?

We have a large data set with 150 beer brands, sold in 85 stores, over 399 weeks. Each brand is further divided into sub-brands (e.g., brand = Budweiser, with sub-brands such as Budweiser light and Budweiser regular). We want to write a function that adds a new column holding the average price per brand, averaged over all rows where the brand is the same, the week is the same, and the store is the same.

So our goal is a column that shows one average price per brand per week per store (e.g., Budweiser in store 1 in week 1). We are struggling to write this if statement or loop, as we are fairly new to R.

So far, we have tried to understand how this step would work without a loop. We selected one specific store, brand, and week and added a column of ones. With that we could compute mean_price, which sums the prices (here, logprice_ounce) of all sub-brands for that week and store and divides the total by the number of sub-brands (obtained by summing the column of ones).

# Keep only the columns we need
try1 <- subset(beer, select = c("brand", "week", "store", "price_ounce", "logprice_ounce", "sales_ounce", "logsales_ounce"))

# Helper column of ones, used to count rows
try1$vector <- 1

# One brand in one store in one week
store5 <- subset(try1, store == 5 & week == 224 & brand == "ariel")
mean_price <- sum(store5$logprice_ounce) / sum(store5$vector)
View(mean_price)
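
The manual calculation above amounts to the arithmetic mean of the selected rows, so the helper column of ones is probably not needed. A minimal sketch on a made-up data frame (the name `toy` and all values are hypothetical, only the column names follow the post):

```r
# Toy data with hypothetical values; column names follow the post.
toy <- data.frame(
  brand = c("ariel", "ariel", "ariel", "other"),
  week = c(224, 224, 224, 224),
  store = c(5, 5, 5, 5),
  logprice_ounce = c(-3.0, -2.5, -2.0, -1.0)
)
toy$vector <- 1

# Same selection as in the post: one brand, one store, one week.
sel <- subset(toy, store == 5 & week == 224 & brand == "ariel")

manual_mean <- sum(sel$logprice_ounce) / sum(sel$vector)  # sum divided by row count
direct_mean <- mean(sel$logprice_ounce)                   # the same quantity via mean()

# TRUE if both ways of computing the mean agree
isTRUE(all.equal(manual_mean, direct_mean))
```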

So far this gives only a single mean price, but we would like a column that shows one mean price per brand, store, and week.
In the end, we need this to run a regression that estimates price elasticities per store.
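
One way a column like the one described above could be built without an explicit loop or if statement is base R's `ave()`, which computes a statistic within groups and returns it row by row. This is only a sketch under assumptions: `beer_toy` and its values are made up, and `price_ounce` is assumed to be the price to average (swap in `logprice_ounce` if that is the one needed).

```r
# Toy data with hypothetical values; column names follow the post.
beer_toy <- data.frame(
  brand = c("budweiser", "budweiser", "budweiser", "ariel", "ariel"),
  week = c(1, 1, 2, 1, 1),
  store = c(1, 1, 1, 1, 2),
  price_ounce = c(0.04, 0.06, 0.05, 0.03, 0.07)
)

# One group id per brand/week/store combination that occurs in the data.
# drop = TRUE discards combinations that never occur.
grp <- interaction(beer_toy$brand, beer_toy$week, beer_toy$store, drop = TRUE)

# ave() returns a vector as long as its input, where each row holds the
# mean of its own group, so it can be stored directly as a new column.
beer_toy$mean_price <- ave(beer_toy$price_ounce, grp, FUN = mean)

beer_toy
```

Here the two Budweiser rows in store 1, week 1 share one group mean, while every other row forms its own group. If one row per brand, week, and store is preferred over a repeated column, `aggregate(price_ounce ~ brand + week + store, data = beer_toy, FUN = mean)` should give that summary table instead.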

We would appreciate any kind of help, as we are completely lost.
Thank you in advance!
