Sunday, May 5, 2019

Conditionally overwrite values in series using a for loop and if statement

I ran a Logit model using statsmodels and stored the predicted values in a Series:

import statsmodels.api as sm

# y_train and X_train are the training target and features
M1 = sm.Logit(y_train, X_train)
M1_results = M1.fit()
y_pred = M1_results.predict(X_train)  # This returns a Series

y_pred is a series with values between 0 and 1. I want to overwrite its values conditionally by comparing them to an arbitrary cutoff.

Basically, if the i-th element of y_pred is <= 0.7, overwrite it with 0. Otherwise, overwrite it with 1.

I tried combining a for loop and an if statement:

for i in y_pred:
    if i <= 0.7:
        i = 0
    else:
        i = 1

How come this didn't overwrite any of the values in y_pred?
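For context, here is a minimal sketch of what is probably happening, using a small made-up Series in place of the real predictions (the values and the name y_loop are assumptions for illustration). In the loop above, i is just a name bound to each value in turn, so assigning to i rebinds that name and never writes back into y_pred. Writing through the index does change the Series:

```python
import pandas as pd

# Hypothetical stand-in for the predicted probabilities
y_pred = pd.Series([0.2, 0.7, 0.9, 0.71])

# Rebinding the loop variable leaves the Series untouched
for p in y_pred:
    p = 0
unchanged = y_pred.tolist()

# Assigning through the index label writes into the Series
y_loop = y_pred.copy()
for idx in y_loop.index:
    y_loop.loc[idx] = 0 if y_loop.loc[idx] <= 0.7 else 1

print(unchanged)
print(y_loop.tolist())
```

An explicit loop like this works but is usually much slower than a vectorized assignment on a large Series.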

I had to resort to boolean indexing (as suggested here):

y_pred[y_pred <= 0.7] = 0
y_pred[y_pred >  0.7] = 1

This will be inconvenient when I move on to multiclass models. How can I achieve the same result using for and if notation?
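As a point of comparison, here is a hedged sketch of how the same labeling might be written without a loop. The sample values, the probs table, and the names labels and multi_labels are made up for illustration; it assumes a multiclass model returns one probability column per class, which may not match the actual output:

```python
import numpy as np
import pandas as pd

# Hypothetical binary probabilities
y_pred = pd.Series([0.2, 0.7, 0.9])

# 0 where the probability is at or below the cutoff, 1 otherwise
labels = pd.Series(np.where(y_pred <= 0.7, 0, 1), index=y_pred.index)

# Hypothetical multiclass probabilities, one column per class
probs = pd.DataFrame({
    0: [0.6, 0.1, 0.2],
    1: [0.3, 0.7, 0.3],
    2: [0.1, 0.2, 0.5],
})

# Pick the column label with the highest probability in each row
multi_labels = probs.idxmax(axis=1)

print(labels.tolist())
print(multi_labels.tolist())
```

Unlike the in-place assignment, np.where builds a new Series and leaves y_pred unchanged.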

PS: Excuse my ignorance. I recently moved from R to Python and everything is really confusing.
