jeudi 12 septembre 2019

Resetting SQL Window's length when flag is True?

I have a table of utility bill segments, where each segment tells a month's kwh usage. I'd like to sum groups of 12 together to get annual bills.

The variable bill_cd indicates the last segment of the annual bill. Data download

    cust_id     kwh    bill_cd
0   3333    1104.388683 ?
1   3333    1498.007305 ?
2   3333    662.044822  ?
3   3333    661.342412  ?
4   3333    494.070683  ?
5   3333    300.147843  ?
6   3333    836.677007  ?
7   3333    864.608037  ?
8   3333    933.232845  ?
9   3333    1191.025358 ?
10  3333    1507.119588 ?
11  3333    1980.653631 BILL
12  3333    2621.387010 ?
13  3333    2552.053789 BILL

The problem: as can be seen, there aren't 12 segments per annual bill. In this snippet, I'd like to only get the sum of the first annual bill since it will include 12 segments, but not the second because it only has two segments.

I wrote this query in python:

import pandas as pd, pandasql as ps

df = pd.read_csv('so_ex.csv')

q1 = """
select cust_id, kwh, bill_cd,
sum(kwh) over (partition by cust_id
    rows between 11 preceding and current row) as kwh_total,
count(kwh) over (partition by cust_id
    rows between 11 preceding and current row) as bseg_count
from df
"""

ps.sqldf(q1, locals())

And the result is below. Is there a way to make the count variable reset upon hitting another instance of "BILL" so that the bill segment count will go to 1. This way, I could filter the table later where bseg_count=12.

   cust_id      kwh   bill_cd  kwh_total bseg_count
0   3333    1104.388683 ?   1104.388683   1
1   3333    1498.007305 ?   2602.395988   2
2   3333    662.044822  ?   3264.440810   3
3   3333    661.342412  ?   3925.783223   4
4   3333    494.070683  ?   4419.853906   5
5   3333    300.147843  ?   4720.001748   6
6   3333    836.677007  ?   5556.678755   7
7   3333    864.608037  ?   6421.286792   8
8   3333    933.232845  ?   7354.519638   9
9   3333    1191.025358 ?   8545.544996   10
10  3333    1507.119588 ?   10052.664584  11
11  3333    1980.653631 BILL 12033.318215 12
12  3333    2621.387010 ?    13550.316542 12
13  3333    2552.053789 BILL 14604.363026 12

Aucun commentaire:

Enregistrer un commentaire