I have data from an agency's ambulance responses. I have created clusters for the locations they respond from and created time bins to divide the time into manageable intervals. I then grouped the data by cluster and then by bin, and aggregated the number of incidents per cluster per bin into a column. I need to fill in zeros in the incident count column for all the time bins where no incidents occurred. I tried a nested for loop with an if/else, but it runs too slowly, so I am trying to find a way to switch to a nested list comprehension with the if/else logic.
count_values = ers_bin_groupby['no_of_incidents'].values
vals = ers_unique
"ers_unique" is a list of lists: for each cluster, it holds all the unique time bins in which incidents occurred.
def fill_missing(count_values,vals):
    smoothed_regions=[]
    ind=0 # ind iterates over count_values only
    for p in range(0,posts):
        smoothed_bins=[]
        for i in range(max(minute_bin_create_times)):
            if i in vals[p]:
                smoothed_bins.append(count_values[ind])
                ind+=1
            else:
                smoothed_bins.append(0)
        smoothed_regions.extend(smoothed_bins)
        print(p)
    return smoothed_regions
This is my attempt at a list comprehension with an if statement:
def fill_missing2(count_values, vals):
    smoothed_regions = []
    ind = 0 #ind iterates over count_values only
    smoothed_regions =[[count_values[ind] ind+=1 if i in vals[p] else 0
                        for i in range(max(minute_bin_create_times))]
                       for p in range(0,posts)]
I can't figure out whether I still need the "ind+=1" to make it progress through count_values.
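For reference, a statement such as "ind+=1" cannot appear inside a comprehension, so the counter has to be replaced by something that advances on its own. One possible sketch, not necessarily the fastest option, uses a shared iterator over count_values. The function name fill_missing_iter and the n_bins parameter (standing in for max(minute_bin_create_times)) are assumptions made for illustration, and the code assumes each vals[p] is sorted ascending and count_values follows the same order, as the loop version also requires.

```python
def fill_missing_iter(count_values, vals, n_bins):
    """Flat list of counts per (post, bin), with 0 where a bin has no incidents.

    n_bins is a hypothetical parameter standing in for
    max(minute_bin_create_times); the number of posts is taken from len(vals).
    """
    # A single shared iterator plays the role of the manual `ind` counter:
    # every next() call returns the following count in order.
    counts = iter(count_values)
    return [
        next(counts) if i in bins else 0   # next() runs only when the bin exists
        for bins in map(set, vals)         # set membership avoids scanning a list
        for i in range(n_bins)
    ]


# Toy input: post 0 has incidents in bins 1 and 3, post 1 in bins 0 and 2.
example = fill_missing_iter([1, 2, 3, 4], [[1, 3], [0, 2]], n_bins=4)
```

Converting each vals[p] to a set is likely where most of the speedup comes from, since "i in vals[p]" on a list scans the list on every check.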
Here is an example of the groupby data I am working with. There are 20 posts and over 500,000 time bins.
post_area  time_bin
0          7           1
           59          1
           104         1
           113         1
           117         1
           147         1
           249         1
           255         1
This is an example of the ers_unique list: [[7, 59, 104, 113, 117, 147, 249, 255, 277, 283, 292, 310, 312, 358, 393, 406, 480, 537, 550, 553, 622,
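Another way to look at the problem, sketched here under stated assumptions, is to skip ers_unique and the running index entirely and look each (post_area, time_bin) pair up in a mapping, defaulting to zero. The function fill_from_grouped and its parameters are hypothetical names, and the mapping is assumed to be built from the grouped counts (for a pandas Series with a two-level index, calling to_dict() should give tuple keys of this shape). The toy values below are illustrative only.

```python
def fill_from_grouped(grouped_counts, n_posts, n_bins):
    """Flat list of counts per (post, bin), with 0 for pairs that never occurred.

    grouped_counts is assumed to map (post_area, time_bin) -> incident count.
    n_posts and n_bins are hypothetical parameters for the number of posts
    and the number of time bins.
    """
    return [
        grouped_counts.get((post, time_bin), 0)   # missing pair -> zero incidents
        for post in range(n_posts)
        for time_bin in range(n_bins)
    ]


# Toy mapping shaped like the grouped data: two posts, sixty bins.
toy_counts = {(0, 7): 1, (0, 59): 1, (1, 3): 2}
filled = fill_from_grouped(toy_counts, n_posts=2, n_bins=60)
```

Because each lookup is keyed by the pair itself, the result does not depend on the order of the grouped rows, unlike the versions that walk count_values with a counter.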