I have the following categories:
A (all) = accepted (activated) as of August 31st
N (new) = accepted in the last three months (Jun, Jul, Aug)
R (returning) = accepted before June, but logged in during the last three months (Jun, Jul, Aug)
D (dropping) = A - N - R
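To make the definitions concrete, here is a minimal sketch that reads them literally. The helper name classify, the 2019 window constants, and the sample dates are all made up for illustration; they are not taken from the real data.

```python
from datetime import date

# Assumed window, read from the definitions above (Jun-Aug 2019).
WINDOW_START = date(2019, 6, 1)
WINDOW_END = date(2019, 8, 31)


def classify(accepted_on, last_login_on):
    """Return "N", "R" or "D" for one user, or None if the user is not in A.

    Hypothetical helper: both arguments are datetime.date objects or None.
    """
    if accepted_on is None or accepted_on > WINDOW_END:
        return None  # never accepted, or accepted after August 31st
    if accepted_on >= WINDOW_START:
        return "N"  # accepted in Jun, Jul or Aug
    if last_login_on is not None and WINDOW_START <= last_login_on <= WINDOW_END:
        return "R"  # accepted before June, logged in during the window
    return "D"  # accepted before June, no login in the window


# Made-up sample rows: (accepted date, last login date)
samples = [
    (date(2019, 7, 10), date(2019, 7, 11)),  # accepted in July
    (date(2019, 3, 2), date(2019, 8, 25)),   # accepted in March, login in August
    (date(2019, 1, 15), date(2019, 4, 30)),  # accepted in January, last login in April
    (date(2019, 2, 1), None),                # accepted in February, never logged in
]
labels = [classify(accepted, login) for accepted, login in samples]
counts = {key: labels.count(key) for key in ("N", "R", "D")}
print(counts)  # {'N': 1, 'R': 1, 'D': 2}
```

With these four sample users A = 4, so D = A - N - R = 4 - 1 - 1 = 2, which matches the count of "D" labels.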
The data looks like this: https://pastebin.com/Ybu9KWqk
I want to split the data into the N and R categories, store them in CSV files, and get the value of D.
I have written this logic for that:
import pandas as pd
from collections import Counter
from datetime import datetime

df = pd.read_csv("work_hrs.csv")

# Activated on or after this date counts as new (N).
threshold_act_date = datetime.strptime("2019-6-01", "%Y-%m-%d").date()
# Last login on or after this date counts as active.
threshold_log_date = datetime.strptime("2019-8-21", "%Y-%m-%d").date()

new_col = []
# Position 2 is read as the last-login date, position 3 as the activation date.
for row in df.iloc[:, [2, 3]].values:
    try:
        last_log = datetime.strptime(row[0][:10], "%Y-%m-%d").date()
        active_in = datetime.strptime(row[1][:10], "%Y-%m-%d").date()
    except (TypeError, ValueError):
        # Missing (NaN) or malformed date.
        new_col.append("not_active")
        continue
    if last_log >= threshold_log_date:
        if active_in >= threshold_act_date:
            new_col.append("N")
        else:
            new_col.append("R")
    else:
        new_col.append("not_active")

# Store the labels so they can be counted below.
df["new_stat"] = new_col

status_counts = Counter(df["status"])
total_came = len(df) - status_counts["unregistered"] - status_counts["notregister"] - status_counts["nan"]
print("Total who came to our platform so far:", total_came)

stat_counts = Counter(df["new_stat"])
dropout = total_came - stat_counts["N"] - stat_counts["R"]
print("The total no. of dropouts is:", dropout)
Does the above code correctly filter the dates into the N and R categories?