I have a data frame containing consumer email data, both fresh and repeat contact emails. I need to find outliers in this data based on two conditions:
condition 1: count1 > 1 and count2 > 1
condition 2: count1 > 1 and count2 < 1
I have checked the function definition syntax in Python and defined a function for outlier classification accordingly:
def outlier():
    for index, row in df.iterrows():
        if([row][count1] > 1 and [row][count2] > 1):
            if(df[row][Journey] == df[row][journey_lag]):
                df[row][outlier] = Same_Property/Date/Agent/Journey
            else:
                df[row][outlier] = Same_Property/Date/Agent-Different Journey
        elif([row][count1] > 1 and [row][count2] == 1):
            if(df[row][Journey] == df[row][journey_lag]):
                df[row][outlier] = Same_Property/Date-Different_Agent/Journey
            else:
                df[row][outlier]=Same_Property/Date_Different_Agent/Journey
    return df
I expect to run this function on the data frame as follows: df.outlier or df.apply(outlier)
Problem: I am not able to get the required results.
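For reference, here is a minimal sketch of how the classification might be written so that it runs, assuming the columns are named count1, count2, Journey and journey_lag as in the function above. The likely problems in the posted code are that `[row][count1]` indexes a one-element list rather than the row, the column names and labels are not quoted strings, `df.outlier` is an attribute lookup rather than a function call, and `df.apply` passes columns unless `axis=1` is given. The second branch follows the code's `count2 == 1` rather than the `count2 < 1` given in the description, which is an assumption. The sample data is hypothetical and only there to make the sketch runnable.

```python
import pandas as pd


def outlier(row):
    """Return an outlier label for one row, or None if no condition matches."""
    same_journey = row["Journey"] == row["journey_lag"]
    if row["count1"] > 1 and row["count2"] > 1:
        if same_journey:
            return "Same_Property/Date/Agent/Journey"
        return "Same_Property/Date/Agent-Different Journey"
    # Assumption: the second condition is count2 == 1, as in the posted code.
    if row["count1"] > 1 and row["count2"] == 1:
        if same_journey:
            return "Same_Property/Date-Different_Agent/Journey"
        return "Same_Property/Date_Different_Agent/Journey"
    return None


# Hypothetical sample data, not taken from the original post.
df = pd.DataFrame({
    "count1": [2, 2, 3, 1],
    "count2": [2, 3, 1, 1],
    "Journey": ["A", "A", "B", "C"],
    "journey_lag": ["A", "B", "B", "C"],
})

# axis=1 passes each row (as a Series) to the function.
df["outlier"] = df.apply(outlier, axis=1)
print(df)
```

Because the function receives one row and returns a label, it no longer needs `iterrows` or access to the global `df`; rows that match neither condition are left without a label.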