Friday, 21 September 2018

Is there a more efficient way than this if statement for large data in Python?

I am working with a large data file of about 1.3 million rows. What I'm trying to do is simple: change the values in some columns when certain conditions are met.

for i in range(0, len(data2)):    # len(data2) is about 1.3 million
    if data2.loc[i, 'PPA'] == 0:
        data1.loc[i, 'LDU'] = 0   # data1 and data2 have the same number of rows
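
For comparison, the same update can probably be written without a Python-level loop by using a boolean mask. This is only a minimal sketch: the toy frames below are made up for illustration, and it assumes `data1` and `data2` share the same index, as the row-by-row loop above implies.

```python
import pandas as pd

# Hypothetical toy frames standing in for the real 1.3-million-row data.
data1 = pd.DataFrame({"LDU": [5, 7, 9, 11]})
data2 = pd.DataFrame({"PPA": [0, 3, 0, 8]})

# Boolean mask: True wherever PPA equals 0.
mask = data2["PPA"] == 0

# Set LDU to 0 for every masked row in one step.
# Assumption: both frames have matching indexes, so the mask aligns row for row.
data1.loc[mask, "LDU"] = 0
```

Each `.loc[i, ...]` call in a loop carries per-call overhead, so a single masked assignment like this usually runs far faster on large frames.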

I also need to recode some other columns, for example:

for i in range(0, len(data1)):
    if data1.loc[i, 'Gender'] == 'F':
        data1.loc[i, 'Gender'] = 0
    else:
        data1.loc[i, 'Gender'] = 1
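
The if/else recoding can likewise be sketched as a single column-wide operation with `numpy.where`. The sample values here are invented for illustration; the logic mirrors the loop above, so anything other than `'F'` becomes 1.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; the real data1 has about 1.3 million rows.
data1 = pd.DataFrame({"Gender": ["F", "M", "F", "M"]})

# Where Gender equals 'F' use 0, otherwise use 1, evaluated for the whole column at once.
data1["Gender"] = np.where(data1["Gender"] == "F", 0, 1)
```

If missing or unexpected values should not silently become 1, mapping with an explicit dictionary (for example `data1["Gender"].map({"F": 0, "M": 1})`) may be a safer choice, since unmapped values become NaN.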

Running just one of these loops takes more than 10 hours on my data. Is there a more efficient way to read and rewrite the values, or is this a normal processing time for 1.3 million rows?

Any help would be appreciated! :)
