Monday, 16 September 2019

Replace many if/else conditions with scikit-learn (classification problem)

I'm trying to wrap my head around ML with scikit-learn.

Here is what I'm trying to do:

import pandas as pd
from sklearn.tree import DecisionTreeClassifier

df = pd.DataFrame({
    "f1": [1, 1],
    "f2": [0, 0],
    "c":  [1, 0]
})


# df (intended contents)
# f1 f2 c     # f1, f2: features / c: class label
# 1  1  1     # for f1 = 1 and f2 = 1, expected c = 1
# 0  0  0     # for f1 = 0 and f2 = 0, expected c = 0


dtc_clf = DecisionTreeClassifier()

features = df[["f1", "f2"]]
labels   = df[["c"]]

dtc_clf.fit(features, labels)

# For the test I reuse exactly the training data
test_features = pd.DataFrame({"ft1": [1, 1],
                              "ft2": [0, 0]})

# test_features (intended contents)
# ft1 ft2
# 1   1
# 0   0


dtc_clf.predict(test_features)

#I'm getting this result:
#array([0, 0])

#I expected this result
#array([1, 0])



If '1, 1 => 1' and '0, 0 => 0', the result should be 'array([1, 0])', right?
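One thing worth checking for the expectation above is what the frame actually contains. A dict passed to pd.DataFrame is keyed by column, so each list becomes a column, not a row. The sketch below compares that construction with a row-wise one; the names by_column and by_row are illustrative and not part of the original code.

```python
import pandas as pd

# Dict keys are column names, so each list fills one column.
by_column = pd.DataFrame({
    "f1": [1, 1],
    "f2": [0, 0],
    "c":  [1, 0],
})

# A list of rows plus explicit column names: one observation per row.
by_row = pd.DataFrame([[1, 1, 1],
                       [0, 0, 0]], columns=["f1", "f2", "c"])

print(by_column.values.tolist())  # [[1, 0, 1], [1, 0, 0]]
print(by_row.values.tolist())     # [[1, 1, 1], [0, 0, 0]]
```

In the column-keyed frame both rows have the same features (1, 0) but different classes, so no tree can tell them apart.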

Each column is a condition: its value is 1 if the condition is met and 0 otherwise. Basically, I'm trying to replace a lot of if/else conditions with ML.
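A minimal sketch of this setup, assuming each row is one observation and each column one condition. The frame is built from a list of rows, and the names rows, X, y and clf are illustrative. export_text prints the learned splits as nested rules, which is one way to compare the tree with hand-written if/else logic.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# One row per observation: (f1, f2, c).
rows = [[1, 1, 1],
        [0, 0, 0]]
df = pd.DataFrame(rows, columns=["f1", "f2", "c"])

X = df[["f1", "f2"]]  # condition columns (features)
y = df["c"]           # class label as a 1-D Series

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, y)

# Predict on the training rows, reusing the same column names.
predictions = clf.predict(X)
print(predictions)  # [1 0]

# Show the learned splits as text rules.
rules = export_text(clf, feature_names=["f1", "f2"])
print(rules)
```

With only two training rows the tree needs a single split. A realistic rule set would need one training row for every combination of conditions the if/else chain distinguishes.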
