I'm trying to create a new boolean variable from an if-statement that combines multiple conditions on other variables. So far, none of my many attempts work, even with a single condition.
I want a variable "Label" that is either TRUE or FALSE. It should be TRUE if the variable "AnzZahlung" is less than or equal to 1, 'DLZ_SCHDATSCHL' falls within the lowest 20% of the data, the difference between 'Schadenwert' and 'zahlgesbrut' is at most 15%, and 4 additional variables are equal to 'N' (a string that stands for False in this dataset).
I sliced the DataFrame several times beforehand, which worked well, and I could also create the different variables. I tried different methods: a lambda, a list comprehension, and writing a function. Even when my code ran without errors, I received no values for my new variable.
I would really appreciate it if anyone can spot the problem. I have already searched the web for two days, but as a beginner I couldn't find the solution.
amount = df4['AnzZahlungIDAD']
time = df4['DLZ_SCHDATSCHL']
Erstr = df4['Schadenwert']
Zahlges = df4['zahlgesbrut']
timequantil = time.quantile(.2)
diff = (Erstr-Zahlges)/Erstr*100
diffrange = [(diff <=15) & (diff >= -15)]
special = df4[['Taxatoreneinsatz', 'Belegpruefereinsatz_rel', 'IntSVKZ', 'ExtTechSVKZ']]
First Method with list comprehension
label = []
label = [True if (amount[i] <= 1) & (time[i] <= timequantil) & (diff == diffrange) & (special == 'N') else False for i in label]
label
Second Method with iterrows()
df4['label'] = pd.Series([])
df4['label'] = [True if (row[amount] <= 1) & (row[time] <= timequantil) & (row[diff] == diffrange) & (row[special] == 'N') else False for row in df4.iterrows()]
df4['label']
Third Method with lambda function
df4.loc[:,'label'] = '1'
df4['label'] = df4['label'].apply([lambda c: True if (c[amount] <= 1) & (c[time] <= timequantil) & (c[diff] == diffrange) & (c[special]) == 'N' else False for c in df4['label']], axis = 0)
df4['label'].value_counts()
I expected to get a variable "Label" in my DataFrame df4 that is either True or False.
Some attempts gave me only all values = False or all = True, even when I used only a single condition, which is impossible given the data.
The first method runs without errors but outputs: []
The second method gives me the following error: TypeError: tuple indices must be integers or slices, not Series
The third method never finishes loading.
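For what it's worth, two of these outcomes look consistent with how the code is written: the first method iterates over the empty list `label`, so the comprehension has nothing to produce, and `iterrows()` yields `(index, row)` tuples, so `row[amount]` indexes a tuple with a Series. One possible way around the row-by-row logic is to combine boolean Series directly. The following is only a minimal sketch under assumptions: the sample values are invented, the column names are taken from the code above, and the helper names (`flags`, `few_payments`, `fast`, `within_15`, `all_n`) are hypothetical.

```python
import pandas as pd

# Invented sample data; only the column names come from the question.
df4 = pd.DataFrame({
    "AnzZahlungIDAD": [1, 2, 0, 1],
    "DLZ_SCHDATSCHL": [1, 50, 60, 70],
    "Schadenwert": [100.0, 100.0, 100.0, 200.0],
    "zahlgesbrut": [90.0, 100.0, 100.0, 100.0],
    "Taxatoreneinsatz": ["N", "N", "N", "N"],
    "Belegpruefereinsatz_rel": ["N", "N", "J", "N"],
    "IntSVKZ": ["N", "N", "N", "N"],
    "ExtTechSVKZ": ["N", "N", "N", "N"],
})

# Hypothetical helper list of the four 'N' flag columns.
flags = ["Taxatoreneinsatz", "Belegpruefereinsatz_rel", "IntSVKZ", "ExtTechSVKZ"]

# Each condition is a boolean Series aligned with df4's index.
few_payments = df4["AnzZahlungIDAD"] <= 1
fast = df4["DLZ_SCHDATSCHL"] <= df4["DLZ_SCHDATSCHL"].quantile(0.2)

# Relative difference in percent, kept within +/- 15 (bounds inclusive).
diff = (df4["Schadenwert"] - df4["zahlgesbrut"]) / df4["Schadenwert"] * 100
within_15 = diff.between(-15, 15)

# True only for rows where all four flag columns equal 'N'.
all_n = df4[flags].eq("N").all(axis=1)

# Element-wise AND of all conditions gives the boolean label column.
df4["label"] = few_payments & fast & within_15 & all_n
print(df4["label"].value_counts())
```

Keeping each condition as its own Series makes it possible to check them one at a time (for example with `few_payments.value_counts()`) before combining them, which may help explain results that are all True or all False.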