I am trying to split a .csv file into two lists: one for training data and one for test data. The condition is that a row with all 36 columns (complete) is training data. Otherwise it is test data: the final column, which is the dependent variable I am making predictions for, is missing.
I have written:
def training_test_split(self, data):
    train_list = []
    test_list = []
    for i in data:
        if len(i[0]) == 36:  # I mean: if the number of columns in the ith row == 36
            train_list.append(data)
        else:
            test_list.append(data)
    return [train_list, test_list]
I set up the data so that one row meets the condition for test_list and the rest meet the condition for train_list, but every row ends up in train_list when I call this function :/ I don't want to use pandas, sorry. Any insight would be valued!
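For reference, here is a minimal sketch of what the split might look like without pandas. It rests on assumptions not stated above: that the file is parsed with the standard-library csv.reader so each row is a list of fields, and that the function is a plain function rather than a method (no self). The n_columns parameter and the three-column sample are hypothetical, chosen only to keep the example short. Compared with the function above, it checks len(row) rather than len(i[0]), which measures the first element of the row and not the row itself, and it appends the individual row rather than the whole data object.

```python
import csv
import io


def training_test_split(rows, n_columns=36):
    """Split parsed CSV rows by field count.

    Rows with exactly n_columns fields go to the training list,
    all other rows go to the test list.
    """
    train_list = []
    test_list = []
    for row in rows:
        # len(row) is the number of fields in this row
        if len(row) == n_columns:
            train_list.append(row)  # append the row, not the whole dataset
        else:
            test_list.append(row)
    return train_list, test_list


# Hypothetical sample: 3 columns instead of 36, one incomplete row
sample = "a,b,c\nd,e\nf,g,h\n"
rows = list(csv.reader(io.StringIO(sample)))
train, test = training_test_split(rows, n_columns=3)
```

One thing worth checking in the real file: if the missing final column is written as an empty trailing field (a line ending in a comma), csv.reader still returns 36 fields for that row, so a length check alone would not separate it. In that case a test such as row[-1] == "" may be closer to what is needed.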