Sunday, March 25, 2018

Python: parsing a tree-structured file

I have the following branch diagram (image not available). The diagram defines the hierarchy rule for the headings in a text file. The file has approximately the following contents:

1: first row QQ1
2: second BB7
3: Second row is miss. This third row. This is root for 4,5,6 rows WW2
4: Fourth row EE3
5: Fifth row. Sixth row also missed RR4
3: Again the third line TT5
4: Again fourth row YY6
6: Now sixth row in place. But fivth row is miss UU7
7: Seventh row II8
8: eighth row. The string is similar to the third and is the root for 9 and 10 OO9
9: ninth row PP1
10: ten row AA1
8: eighth row DD1
10: ten row. ninth row is miss GG1
11: End of block
1: first row QQ2
2: second row WW2
3: third root row RR4
4: Fourth row OO6
3: Again third row EE3
4: fourth row GG7
6: sixth row FF0
7: Seventh row AA3
11: End of block

I need to parse this text into the structure below:

first file:
QQ1,BB7,WW2,EE3,RR4,null
QQ1,BB7,TT5,YY6,null,UU7
QQ2,WW2,RR4,OO6,null,null
QQ2,WW2,EE3,GG7,null,FF0

second file:
QQ1,BB7,II8,OO9,PP1,null
QQ1,BB7,II8,DD1,null,GG1
QQ2,WW2,AA3,null,null,null

My code:

with open('111.txt', 'r') as f:
    first = []
    second = []
    for line in f:
        if line[0:2].replace(':','')=='1':
            e1=line[-4:]
        elif line[0:2].replace(':','')=='2':
            e2=line[-4:]
        elif line[0:2].replace(':','')=='3':
            e3=line[-4:]
            for line in f:
                if line[0:2].replace(':', '') == '4':
                    e4 = line[-4:]
                elif line[0:2].replace(':', '') == '5':
                    e5 = line[-4:]
                elif line[0:2].replace(':', '') == '6':
                    e6 = line[-4:]
                    first.append('{},{},{},{},{},{}'.format(e1, e2, e3, e4, e5, e6))
                    # How do I break out of this loop and append elements 1-6 to the list? The sixth row may be absent.
        elif line[0:2].replace(':','')=='7':
            e7 = line[-4:]
        elif line[0:2].replace(':','')=='8':
            e8 = line[-4:]
            for line in f:
                if line[0:2].replace(':', '') == '9':
                    e9 = line[-4:]
                elif line[0:2].replace(':', '') == '10':
                    e10 = line[-4:]
                    second.append('{},{},{},{},{},{}'.format(e1, e2, e7, e8, e9, e10))

My problem: at the step marked with the comment, I do not understand how to leave the nested loop and return to the top-level loop without skipping ahead to the next line of the file.
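
One possible way around this, offered as a sketch rather than a definitive answer: the nested `for line in f` loops share the same file iterator as the outer loop, so every line the inner loop reads is gone by the time control returns to the top level. A single loop that remembers the last value seen at each level avoids the issue. The names `parse_blocks`, `FIRST_LEVELS` and `SECOND_LEVELS` are hypothetical, and the code assumes that every line starts with `<level>:` and that the key is the last word on the line. Note that for the sample above it would output `AA1` rather than `null` in the last column of the `OO9` record, because the sample does contain a level-10 row under that level-8 row.

```python
FIRST_LEVELS = (1, 2, 3, 4, 5, 6)      # columns of the first file
SECOND_LEVELS = (1, 2, 7, 8, 9, 10)    # columns of the second file


def parse_blocks(lines):
    """Return (first, second): two lists of comma-joined records."""
    first, second = [], []
    values = {}     # level -> last key seen at that level
    pending = None  # "first", "second" or None: which record is still open

    def flush():
        # Emit the open record, if any, padding missing levels with "null".
        nonlocal pending
        if pending == "first":
            first.append(",".join(values.get(k, "null") for k in FIRST_LEVELS))
        elif pending == "second":
            second.append(",".join(values.get(k, "null") for k in SECOND_LEVELS))
        pending = None

    def clear(*levels):
        for k in levels:
            values.pop(k, None)

    for line in lines:
        line = line.strip()
        if not line:
            continue
        head, _, rest = line.partition(":")
        level = int(head)                        # assumption: "<level>:" prefix
        words = rest.split()
        token = words[-1] if words else "null"   # assumption: key is the last word

        if level in (1, 11):        # start or end of a block
            flush()
            values.clear()
            if level == 1:
                values[1] = token
        elif level == 3:            # root of a first-file record
            flush()
            clear(4, 5, 6)
            values[3] = token
            pending = "first"
        elif level == 7:            # parent of the second-file records
            flush()
            clear(8, 9, 10)
            values[7] = token
            pending = "second"
        elif level == 8:            # root of a second-file record
            if pending != "second" or 8 in values:
                flush()             # close the previous record first
                clear(9, 10)
            values[8] = token
            pending = "second"
        else:                       # levels 2, 4, 5, 6, 9, 10 only fill a slot
            values[level] = token

    flush()                         # in case the input lacks a final level-11 row
    return first, second
```

With the file from the question, this could be called as `first, second = parse_blocks(open('111.txt'))`, after which each list can be written out with `'\n'.join(...)`. A level-7 row without any level-8 child still produces one record padded with `null`, which matches the `QQ2,WW2,AA3,null,null,null` line in the expected output.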
