Tuesday, September 24, 2019

Python script to filter dates using if conditions

I have the following categories:

A (all) = accepted (activated) as of August 31st
N (new) = accepted in the last three months (June, July, August)
R (returning) = accepted before June, but logged in during the last three months (June, July, August)
D (dropping)

Dropping = A - N - R
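The subtraction above can be sketched with sets. This is only an illustration: the user IDs are made up, and it assumes N and R are both subsets of A. By the definitions, N (accepted from June on) and R (accepted before June) cannot overlap, which is why subtracting the counts gives the same answer as subtracting the sets.

```python
# Hypothetical user IDs, for illustration only (not from the real data).
accepted = {"u1", "u2", "u3", "u4", "u5"}   # A: everyone accepted as of August 31st
new = {"u1", "u2"}                          # N: accepted in June, July, or August
returning = {"u3"}                          # R: accepted before June, logged in recently

# D as a set difference: accepted users that are neither new nor returning.
dropping = accepted - new - returning

# D as count arithmetic. This matches the set result only because
# N and R are disjoint subsets of A.
dropping_count = len(accepted) - len(new) - len(returning)

print(sorted(dropping), dropping_count)
```

If a row could end up in both N and R, the count-based formula would subtract it twice, so it is worth checking that the two labels never overlap.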

The data looks like this: https://pastebin.com/Ybu9KWqk

I want to filter the data into the N and R categories, store them in a CSV file, and compute the value of D.

I have written the following logic for that.

import pandas as pd
from collections import Counter
from datetime import datetime

df = pd.read_csv("work_hrs.csv")

new_col = []
threshold_act_date = datetime.strptime("2019-6-01", '%Y-%m-%d').date()
threshold_log_date = datetime.strptime("2019-8-21", '%Y-%m-%d').date()

# Column 2 is read as the last login, column 3 as the activation date.
for row in df.iloc[:, [2, 3]].values:
    try:
        # Keep only the YYYY-MM-DD part of each timestamp string.
        last_log = datetime.strptime(row[0][:10], '%Y-%m-%d').date()
        active_in = datetime.strptime(row[1][:10], '%Y-%m-%d').date()

        if last_log >= threshold_log_date:
            if active_in >= threshold_act_date:
                new_col.append("N")
            else:
                new_col.append("R")
        else:
            new_col.append("not_active")
    except (TypeError, ValueError):
        # Missing (NaN) or malformed dates are treated as not active.
        new_col.append("not_active")

# Store the labels so that the counts below can read them.
df["new_stat"] = new_col

total_came = len(df) - Counter(df["status"])["unregistered"] - Counter(df["status"])["notregister"] - Counter(df["status"])["nan"]

print("Total come to our platform so far :", total_came)

dropout = total_came - Counter(df["new_stat"])["N"] - Counter(df["new_stat"])["R"]

print("The total no. of dropouts are :", dropout)

Does the above code correctly filter the dates into the N and R categories?
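One way to check this is to label a few hand-made rows by following the written definitions literally and compare the result with what the script produces. The following is a minimal sketch using only the standard library, not the pandas code above. It assumes the window runs from June 1 to August 31, 2019; the column names `accepted` and `last_login`, the helper names, and the sample rows are all hypothetical.

```python
import csv
import io
from collections import Counter
from datetime import date, datetime

WINDOW_START = date(2019, 6, 1)   # assumed start of the last three months
CUTOFF = date(2019, 8, 31)        # "as of August 31st"

def parse_day(text):
    """Parse the leading YYYY-MM-DD of a timestamp; None if missing or malformed."""
    try:
        return datetime.strptime(text[:10], "%Y-%m-%d").date()
    except (TypeError, ValueError):
        return None

def label(row):
    """Label one row as N, R, D, or not_accepted, following the stated definitions."""
    accepted = parse_day(row["accepted"])
    last_login = parse_day(row["last_login"])
    if accepted is None or accepted > CUTOFF:
        return "not_accepted"
    if accepted >= WINDOW_START:
        return "N"
    if last_login is not None and WINDOW_START <= last_login <= CUTOFF:
        return "R"
    return "D"

def to_csv(rows):
    """Serialize rows to CSV text (in memory here; a file object works the same way)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["accepted", "last_login"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# Made-up sample rows, one per expected label.
rows = [
    {"accepted": "2019-07-10 09:00:00", "last_login": "2019-08-25 12:00:00"},  # N
    {"accepted": "2019-02-01 09:00:00", "last_login": "2019-06-03 12:00:00"},  # R
    {"accepted": "2019-02-01 09:00:00", "last_login": "2019-04-03 12:00:00"},  # D
    {"accepted": "", "last_login": ""},                                        # never accepted
]

labels = [label(r) for r in rows]
counts = Counter(labels)
total_accepted = counts["N"] + counts["R"] + counts["D"]
dropout = total_accepted - counts["N"] - counts["R"]

new_csv = to_csv([r for r, l in zip(rows, labels) if l == "N"])
returning_csv = to_csv([r for r, l in zip(rows, labels) if l == "R"])
```

Under these assumptions, the second sample row (last login on 2019-06-03) is labeled R. The script above would label the same row not_active, because its login threshold is 2019-8-21 rather than the start of June. The script also requires a recent login for N, which the written definition of N does not mention.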
