Thursday, 4 February 2021

Create a column with the mean of a variable meeting a certain condition

I want to add a column to my broad_data dataset in which every row contains the mean of the visits variable over a specific time window (a 2-minute window).

Each of the two datasets (broad_data and traffic_data) stores the date and time together in a single column. The timestamps differ between the two datasets, so this is a way of merging them.

For every row of broad_data$dateandtime, I want to take the two-minute window before (or after) that date and time, compute the mean of traffic_data$visits over that window, and store it in the new column of broad_data.
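To make the window concrete, here is a minimal sketch for a single timestamp. The toy values, the POSIXct class, the UTC time zone, and the column names date_time and visits are assumptions for illustration only; the real data may differ.

```r
# Hypothetical toy data; real column names and classes may differ.
base_time <- as.POSIXct("2021-02-04 10:00:00", tz = "UTC")
traffic_data <- data.frame(
  date_time = base_time + c(0, 60, 100, 200, 400),  # offsets in seconds
  visits    = c(10, 20, 30, 40, 50)
)

# One timestamp taken from broad_data (assumed value): 10:02:00
t0 <- base_time + 120

# Traffic rows in the two minutes before t0, both ends included.
# Adding or subtracting a number from a POSIXct value shifts it by seconds.
in_window <- traffic_data$date_time >= t0 - 120 & traffic_data$date_time <= t0

# Mean of visits over the rows that fall inside the window
window_mean <- mean(traffic_data$visits[in_window])
print(window_mean)
```

Here the rows at 0, 60 and 100 seconds fall inside the window, so the result is the mean of 10, 20 and 30.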

broad_data contains 6530 observations and traffic_data has around 10 million.

I tried to make it work using for loops and if conditions, but had no luck. I suspect there is a way to do it with dplyr.

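# Assumes both date_time columns are POSIXct, so +/- 120 means 120 seconds.
v <- numeric(nrow(broad_data))
for (i in seq_len(nrow(broad_data))) {
  t0 <- broad_data$date_time[i]
  # Rows of traffic_data within two minutes before or after t0
  in_window <- traffic_data$date_time >= t0 - 120 &
    traffic_data$date_time <= t0 + 120
  # Mean of the visits in the window; 0 when the window is empty
  v[i] <- if (any(in_window)) mean(traffic_data$visits_index[in_window]) else 0
}
broad_data$v <- v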
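With 6530 rows on one side and around 10 million on the other, comparing every traffic timestamp against every broad_data timestamp is likely to be slow. One possible alternative, sketched here in base R rather than dplyr, sorts the traffic data once and then uses findInterval() with a cumulative sum to get each window mean. The helper name window_mean, the half_width parameter, and the toy data are hypothetical; the sketch also assumes POSIXct timestamps with no missing values and a window of 120 seconds on each side.

```r
# Hypothetical helper (not from any package): mean of `values` whose `times`
# lie within `half_width` seconds of each query time, or 0 if none do.
window_mean <- function(query_times, times, values, half_width = 120) {
  ord <- order(times)
  t_sorted <- as.numeric(times[ord])
  # Prefix sums; as.numeric() avoids integer overflow on long vectors
  csum <- c(0, cumsum(as.numeric(values[ord])))
  q <- as.numeric(query_times)
  # Number of sorted times <= q + half_width
  hi <- findInterval(q + half_width, t_sorted)
  # Number of sorted times strictly below q - half_width
  lo <- findInterval(q - half_width, t_sorted, left.open = TRUE)
  n <- hi - lo  # rows inside each window
  ifelse(n > 0, (csum[hi + 1] - csum[lo + 1]) / n, 0)
}

# Toy data for illustration only (assumed names and values)
base_time <- as.POSIXct("2021-02-04 10:00:00", tz = "UTC")
traffic_data <- data.frame(
  date_time    = base_time + c(0, 60, 100, 200, 400),
  visits_index = c(10, 20, 30, 40, 50)
)
broad_data <- data.frame(date_time = base_time + c(120, 1000, 400))

broad_data$v <- window_mean(broad_data$date_time,
                            traffic_data$date_time,
                            traffic_data$visits_index)
print(broad_data)
```

The sort is done once, and each lookup is a binary search, so the cost grows roughly with the size of traffic_data times a logarithm instead of with the product of the two row counts. For a window only before (or only after) each timestamp, the two findInterval() bounds would need to be shifted accordingly.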
