Wednesday, 5 September 2018

Python: splitting test and training data with an if statement

I am trying to split a .csv file into two lists, one of training data and one of test data. The rule is that a row with all 36 columns (complete) is training data. Otherwise it is test data: the final column is missing, and that column is the dependent variable I am predicting.

I have written:

def training_test_split(self, data):
    train_list = []
    test_list = []
    for i in data:
        if len(i[0]) == 36:  # I mean: if the number of columns in the ith row is 36
            train_list.append(data)
        else:
            test_list.append(data)
    return [train_list, test_list]

To test it, I passed in data where one row should meet the condition for test_list and the rest should meet the condition for train_list. But every row ends up in train_list when I call the function. I would prefer not to use pandas, sorry. Any insight would be appreciated!
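Two things in the posted function look suspicious: `len(i[0])` measures the first field of a row rather than the number of fields in the row, and `append(data)` adds the whole dataset instead of the current row. A minimal sketch of a possible fix follows, assuming each item in `data` is a list of fields as produced by the standard-library `csv.reader`. The constant `EXPECTED_COLUMNS`, the parameter `n_columns`, and the in-memory sample are hypothetical names and data introduced for illustration, and `self` is dropped so the function runs on its own.

```python
import csv
import io

EXPECTED_COLUMNS = 36  # a complete row, per the question (hypothetical constant name)


def training_test_split(rows, n_columns=EXPECTED_COLUMNS):
    """Split rows into (train_list, test_list) by the number of fields per row."""
    train_list = []
    test_list = []
    for row in rows:
        # len(row) counts the fields in this row; len(row[0]) would count
        # the characters of the first field instead.
        if len(row) == n_columns:
            train_list.append(row)  # append the row itself, not the whole dataset
        else:
            test_list.append(row)
    return train_list, test_list


# Hypothetical sample: two complete rows and one row missing the last column.
complete = ",".join(str(i) for i in range(36))
incomplete = ",".join(str(i) for i in range(35))
sample = io.StringIO("\n".join([complete, incomplete, complete]))

train, test = training_test_split(csv.reader(sample))
print(len(train), len(test))  # 2 1
```

This only works if the missing value really shortens the row. If the file keeps a trailing comma for the empty last column, `csv.reader` still yields 36 fields, and the check would need to look at whether the last field is an empty string instead.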
