Wednesday, December 23, 2020

Python if statement never reaches the elif and else branches

I'm trying to open an XML file and parse it, going through its tags and reading the text within each specific tag. If the barcode matches the regex in a filter, I want to remove part of the string or substitute it with something else.

However, the code always seems to stay inside the third if statement, as though end_int were always None. I don't understand why: when I print end_int right after computing it, I see all the end_char values from the XML file, which is what end_int should hold. Inside the if statement, though, end_int always appears to be None.

The mfn_pn variable is a barcode entered by the user, something similar to ATL-157-1815, DFW-184-8378., ATL-324-3243., DFW-432-2343, ATL 343 8924, DFW 342 3413, DFW-324 3423 T&R.

The XML file has the following data:

<?xml version="1.0" encoding="utf-8"?>
<metadata>
    <filter>
        <regex>ATL|LAX|DFW</regex>
        <start_char>3</start_char>
        <end_char></end_char>
        <action>remove</action>
    </filter>
    <filter>
        <regex>DFW.+\.$</regex>
        <start_char>3</start_char>
        <end_char>-1</end_char>
        <action>remove</action>
    </filter>
    <filter>
        <regex>\-</regex>
        <replacement></replacement>
        <action>substitute</action>
    </filter>
    <filter>
        <regex>\s</regex>
        <replacement></replacement>
        <action>substitute</action>
    </filter>
    <filter>
        <regex> T&amp;R$</regex>
        <start_char></start_char>
        <end_char>-4</end_char>
        <action>remove</action>
    </filter>
</metadata>
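
One detail of this data that may matter: as far as I can tell, xml.etree.ElementTree reports the text of an empty element such as <end_char></end_char> as None rather than as an empty string, and find() returns None when the child element is absent. The sketch below checks this on an inline copy of the first filter; SAMPLE and the variable names are illustrative only and not part of the original script.

```python
import xml.etree.ElementTree as ET

# Inline copy of the first filter from the data above (illustrative only).
SAMPLE = """<filter>
    <regex>ATL|LAX|DFW</regex>
    <start_char>3</start_char>
    <end_char></end_char>
    <action>remove</action>
</filter>"""

node = ET.fromstring(SAMPLE)

start_text = node.find('start_char').text   # the string '3'
end_text = node.find('end_char').text       # None: the element has no text
missing = node.find('replacement')          # None: there is no such child

print(repr(start_text), repr(end_text), repr(missing))
```

So an empty start_char or end_char and a missing one both end up as None after the conversion step in the script.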

The Python code I'm using is:

import re
from xml.etree.ElementTree import ElementTree

# filters.xml is the file that holds the things to be filtered
tree = ElementTree()
tree.parse("filters.xml")

# Get the data in the XML file 
root = tree.getroot()

# Loop through filters
for x in root.findall('filter'):

    # Find the text inside the regex tag
    regex = x.find('regex').text
    # Find the text inside the start_char tag
    start_prim = x.find('start_char')
    
    # If the element exists assign its text to start variable
    start = start_prim.text if start_prim is not None else None
    start_int = int(start) if start is not None else None
    print('start: ', start_int)

    # Find the text inside the end_char tag
    end_prim = x.find('end_char')

    # If the element exists assign its text to end variable
    end = end_prim.text if end_prim is not None else None
    end_int = int(end) if end is not None else None
    print('end: ', end_int)

    # Find the text inside the action tag
    action = x.find('action').text

    if action == 'remove':
        if re.match(r'%s' % regex, mfn_pn, re.IGNORECASE):
            print('if statement start:', start_int)
            print('if statement end:', end_int)
            if end_int is None:
                print('if statement start_int:', start_int)
                print('if statement end_int:', end_int)
                mfn_pn = mfn_pn[start_int:]
            elif start_int is None:
                print('elif statement start_int:', start_int)
                print('elif statement end_int:', end_int)
                mfn_pn = mfn_pn[:end_int]
            else:
                print('else statement start_int:', start_int)
                print('else statement end_int:', end_int)
                mfn_pn = mfn_pn[start_int:end_int]
    elif action == 'substitute':
        mfn_pn = re.sub(r'%s' % regex, '', mfn_pn)

The print statements inside the elif and else branches never print anything. It looks as if start_int is never None and the else case is never reached, because the end_int == None test always seems to be true. I'm not sure why, since printing end_int outside the if statements shows all the end_char values from the XML file.
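
A possible explanation, offered as a sketch and not a confirmed diagnosis: the filters run in order and each one rewrites mfn_pn, so once the first filter strips the three-letter prefix, the later remove filters may no longer match. In addition, re.match only matches at the beginning of the string. The function apply_filters and the FILTERS list below are hypothetical stand-ins for the loop and the XML file; the branch logic mirrors the posted code, with None standing for an empty or missing start_char or end_char.

```python
import re

# Filters copied from the XML as (regex, start, end, action) tuples.
# None stands for an empty or missing start_char / end_char element.
FILTERS = [
    (r'ATL|LAX|DFW', 3, None, 'remove'),
    (r'DFW.+\.$', 3, -1, 'remove'),
    (r'\-', None, None, 'substitute'),
    (r'\s', None, None, 'substitute'),
    (r' T&R$', None, -4, 'remove'),
]


def apply_filters(mfn_pn, filters=FILTERS):
    """Apply the filters in order; return the result and the branches taken."""
    branches = []
    for regex, start, end, action in filters:
        if action == 'remove':
            # re.match only succeeds if the pattern matches at position 0.
            if re.match(regex, mfn_pn, re.IGNORECASE):
                if end is None:
                    branches.append('if')
                    mfn_pn = mfn_pn[start:]
                elif start is None:
                    branches.append('elif')
                    mfn_pn = mfn_pn[:end]
                else:
                    branches.append('else')
                    mfn_pn = mfn_pn[start:end]
        elif action == 'substitute':
            mfn_pn = re.sub(regex, '', mfn_pn)
    return mfn_pn, branches


# All filters in order: only the first remove filter ever matches.
all_result = apply_filters('DFW-184-8378.')

# Second filter on its own: the else branch is reachable with the same input.
second_only = apply_filters('DFW-184-8378.', FILTERS[1:2])

print(all_result)
print(second_only)
```

Under these assumptions, the full filter list takes only the first branch for this barcode, while the second filter applied by itself takes the else branch. That would suggest the order of the filters, not the if/elif/else logic, decides which branch runs.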
