Friday, 20 December 2019

PLY yacc: detecting the end of a block/loop

I'm writing a parser with PLY (lex/yacc) and I'm having trouble detecting the end of a block or loop. The difficulty comes from Python-style indentation: blocks are delimited by indentation alone, and I can't work out how to handle that.

Here's my test input:

print(33)
a=5
while a<3:
    a=2
b=3

Here's what I get:

Program
|  print
|  |  '33'
|  =
|  |  'a'
|  |  '5'
|  while
|  |  < (2)
|  |  |  'a'
|  |  |  '3'
|  |  Program
|  |  |  =
|  |  |  |  'a'
|  |  |  |  '2'
|  |  |  =
|  |  |  |  'b'
|  |  |  |  '3'

We can clearly see that the last line (b=3) ends up inside the body of the while loop instead of at the same level as the while statement. So my question is: how can I detect the end of the loop/block?

Here's my lexer:

reserved_words = (
    'if',
    'print',
    'range',
    'for',
    'in',
    'while'
)

tokens = (
    'ADD_OP',
    'COMPARATOR',
    'IDENTIFIER',
    'ILLEGAL',
    'FLOAT',
    'INT',
    'EQU',
    'ENTER',
    'POINTS',
    'TAB'
    ) + tuple(map(lambda s:s.upper(),reserved_words))



literals = '()'

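def t_ENTER(t):
    r'\n'
    # Count lines here: this rule is defined before t_newline, so it consumes every newline first.
    t.lexer.lineno += 1
    return t
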
def t_ADD_OP(t):
    r'\+|-'
    return t

def t_POINTS(t):
    r':'
    return t

def t_EQU(t):
    r'\='
    return t    


def t_COMPARATOR(t):
    r'[<>]'
    return t

def t_INT(t):
    r'\b(?<!\.)\d+(?!\.)\b'
    try:
        t.value = t.value   
    except ValueError:
        print ("Line %d: Problem while parsing %s!" % (t.lineno,t.value))
        t.value = 0
    return t

def t_ILLEGAL(t):
    r'\d+[a-zA-Z]+'
    try:
        t.value = t.value   
    except ValueError:
        print ("Line %d: Problem while parsing %s!" % (t.lineno,t.value))
        t.value = 0
    return t

def t_FLOAT(t):
    r'\d+\.{1}\d+'
    try:
        t.value = float(t.value)   
    except ValueError:
        print ("Line %d: Problem while parsing %s!" % (t.lineno,t.value))
        t.value = 0.0
    return t

def t_IDENTIFIER(t):
    r'[A-Za-z_]\w*'
    if t.value in reserved_words:
        t.type = t.value.upper()
    return t

def t_TAB(t):
    r'[ \t]{4}'
    return t


def t_newline(t):
    r'\n+'
    t.lexer.lineno += len(t.value)

And my grammar rules:

def p_programme_statement(p):
    ''' programme : statement  '''
    p[0] = AST.ProgramNode(p[1])

def p_programme_recursive(p):
    ''' programme : statement ENTER programme '''
    p[0] = AST.ProgramNode([p[1]]+p[3].children)

def p_statement(p):
    ''' statement : assignation
                        | structure '''
    p[0] = p[1]

def p_expression_num_or_var(p):
    '''expression : INT
        | FLOAT 
        | IDENTIFIER 
        '''
    p[0] = AST.TokenNode(p[1])

def p_statement_print(p):
    ''' statement : PRINT expression '''
    p[0] = AST.PrintNode(p[2])

def p_expression_comp(p):
    ''' expression : expression COMPARATOR expression'''
    p[0] = AST.OpNode(p[2],[p[1],p[3]])

def p_structure_if(p):
    '''structure : IF expression POINTS ENTER TAB programme '''
    p[0] = AST.IfNode([p[2],p[6]])

def p_structure_while(p):
    ''' structure : WHILE expression POINTS ENTER programme '''
    p[0] = AST.WhileNode([p[2],p[5]])

def p_expression_paren(p):
    '''expression : '(' expression ')' '''
    p[0] = p[2]


def p_assign(p):
    ''' assignation : IDENTIFIER EQU expression '''
    p[0] = AST.AssignNode([AST.TokenNode(p[1]),p[3]])

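def p_error(p):
    if p is None:  # PLY passes None when the error is at end of input
        print ("Syntax error at end of input")
        return
    print ("Syntax error in line %d" % p.lineno)
    yacc.errok()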

Tell me if anything is unclear. I think my structure_if and structure_while rules are incorrect, but I don't know how to fix them.
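One possible shape for those two rules is sketched below. It is only a sketch under assumptions: INDENT and DEDENT are hypothetical tokens that are not in the token list above and would have to be produced by a filter between the lexer and the parser, with each DEDENT arriving before the ENTER that ends the last line of the block. The `block` rule is also an invented helper, and plain tuples stand in for the AST nodes so that the snippet runs without PLY or the AST module.

```python
# Hypothetical tokens: INDENT and DEDENT would need to be added to `tokens`
# and emitted by a token filter; the lexer shown above does not produce them.

def p_block(p):
    ''' block : POINTS ENTER INDENT programme DEDENT '''
    # The DEDENT closes the block, so the inner programme cannot
    # absorb the statements that follow the loop.
    p[0] = p[4]

def p_structure_if(p):
    ''' structure : IF expression block '''
    p[0] = ('if', p[2], p[3])      # tuple used in place of AST.IfNode

def p_structure_while(p):
    ''' structure : WHILE expression block '''
    p[0] = ('while', p[2], p[3])   # tuple used in place of AST.WhileNode

# yacc calls each rule with an indexable production object; a plain list
# imitates it here so the reduction can be tried without a parser.
p = [None, 'while', ('<', 'a', '3'), ['a=2']]
p_structure_while(p)
print(p[0])
```

With an explicit closing token, `programme` inside the block stops at DEDENT, and the ENTER that follows continues the outer `programme`, which is where b=3 would then be attached.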

@UPDATE

I can detect the end of the loop if I put a terminator character such as ';' at the end of the line (a=2;). But Python does not use ';' to close a block. What should I use instead?
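Instead of a ';' terminator, the indentation itself can supply the end-of-block marker. The following is a minimal sketch of that idea, not PLY code: the function name `layout_tokens` and the marker names INDENT, DEDENT and LINE are assumptions made for illustration. It works line by line with a stack of indentation widths, and it emits each DEDENT before the ENTER that separates the block from the next statement. In a real lexer each LINE entry would be replaced by the tokens of that line.

```python
def layout_tokens(source):
    """Turn leading whitespace into explicit INDENT/DEDENT markers."""
    out = []
    stack = [0]          # indentation widths of the currently open blocks
    first = True
    for raw in source.splitlines():
        if not raw.strip():
            continue     # blank lines do not open or close blocks
        text = raw.expandtabs(4)
        width = len(text) - len(text.lstrip(' '))
        if width > stack[-1]:
            # Deeper than the enclosing block: a new block starts here.
            out.append(('ENTER', ''))
            stack.append(width)
            out.append(('INDENT', ''))
        else:
            # Same level or shallower: close every block that has ended.
            while width < stack[-1]:
                stack.pop()
                out.append(('DEDENT', ''))
            if width != stack[-1]:
                raise IndentationError('inconsistent dedent: %r' % raw)
            if not first:
                out.append(('ENTER', ''))
        out.append(('LINE', text.strip()))
        first = False
    # Close the blocks that are still open at end of input.
    out.extend(('DEDENT', '') for _ in stack[1:])
    return out

for kind, text in layout_tokens("print(33)\na=5\nwhile a<3:\n    a=2\nb=3\n"):
    print(kind, text)
```

For the test input from the question, the DEDENT marker lands between a=2 and b=3, which is the information the grammar is currently missing.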
