Monday, February 29, 2016

Python: NameError when building a DataFrame with pandas

I have a sample inputfile.txt:

chr1    34870071    34899867    pi-Fam168b.1    -
chr11   98724946    98764609    pi-Wipf2.1  +
chr11   105898192   105920636   pi-Dcaf7.1  +
chr11   120486441   120495268   pi-Mafg.1   -
chr12   3891106 3914443 pi-Dnmt3a.1 +
chr12   82815946    82882157    pi-Map3k9.1 -

Column 1 = chromosome_number
Column 2 = start
Column 3 = end
Column 4 = gene
Column 5 = strand (either + or -)

This is the code that I have:

import pandas as pd
df1=pd.read_csv('filename.txt',names= ['chr','start','stop','gene','strand'], delimiter=r'\s+')
count =0
c = 0
for i in df1.index:
    for y in df1.index:
        if abs(df1.loc[i,"start"]- df1.loc[y,"stop"]) < 201:
            if i != y:
                index 
                c +=1
print(c)

I keep getting an error saying that the name 'index' is not defined:

File "./filename.py", line 25, in <module>
index 
NameError: name 'index' is not defined

I am using Python 2.7.10. Any feedback is much appreciated. Thanks.
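For reference, the traceback points at the bare word index inside the inner loop: Python evaluates it as a variable name, and no variable with that name was ever assigned. A minimal sketch of one possible fix follows, assuming the intent was to record which pairs of rows matched (that intent is a guess, not something stated above). The helper name close_pairs, the max_gap parameter, and the use of io.StringIO in place of the input file are all hypothetical choices made for illustration. Like the original loop, it compares every row with every other row and does not check that both rows are on the same chromosome.

```python
import io
import pandas as pd

# Sample rows copied from the question; io.StringIO stands in for the input file
# so the sketch runs on its own.
SAMPLE = u"""chr1    34870071    34899867    pi-Fam168b.1    -
chr11   98724946    98764609    pi-Wipf2.1  +
chr11   105898192   105920636   pi-Dcaf7.1  +
chr11   120486441   120495268   pi-Mafg.1   -
chr12   3891106 3914443 pi-Dnmt3a.1 +
chr12   82815946    82882157    pi-Map3k9.1 -
"""


def close_pairs(df, max_gap=200):
    """Return (i, y) label pairs where row i's start is within max_gap of row y's stop.

    Hypothetical helper: it mirrors the nested loop in the question
    (abs(start - stop) < 201 is the same test as <= 200 for integers),
    but appends each matching pair to a list instead of evaluating
    the undefined name `index`.
    """
    pairs = []
    for i in df.index:
        for y in df.index:
            if i != y and abs(df.loc[i, "start"] - df.loc[y, "stop"]) <= max_gap:
                pairs.append((i, y))
    return pairs


df1 = pd.read_csv(
    io.StringIO(SAMPLE),
    names=["chr", "start", "stop", "gene", "strand"],
    sep=r"\s+",
)

pairs = close_pairs(df1)
print(len(pairs))  # number of matching pairs, the equivalent of `c` in the question
```

If only the count is needed, simply deleting the stray index line from the original loop would also remove the NameError; the list is kept here on the assumption that the matching rows themselves are of interest.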
