Friday, 7 October 2016

Python loop that gets HTML tags returns an empty list instead of the tags

I'm trying to write a function that goes through a list of HTML tags stored as individual characters and returns the tag names. For example, it would go through a list like the one below

['<', 'h', 't', 'm', 'l', '>', '<', 'h', 'e', 'a', 'd', '>', '<', 'm', 'e', 't', 'a', '>']

and return a list like this

[ 'html', 'head', 'meta' ]

However, when I run the function below, it returns an empty list []

def getTag(htmlList):
    tagList=[]
    for iterate, character in enumerate(htmlList):
        tagAppend = ''
        if character=='<':
            for index, word in enumerate(htmlList):
                if index>iterate:
                    if character=='>':
                        tagList.append(tagAppend)
                        break
                    tagAppend += character

    return tagList
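
Tracing the function by hand suggests why the result is empty: the inner loop names its item word, but the comparison and the append still use character, which is always '<' at that point, so the '>' test never succeeds and nothing reaches tagList. A minimal sketch of the same nested-loop approach with that one change is below. The name getTagFixed is hypothetical, only there to keep it apart from the original.

```python
def getTagFixed(htmlList):
    # Hypothetical name; same structure as getTag, but the inner loop
    # reads its own variable (word) instead of the outer one (character).
    tagList = []
    for iterate, character in enumerate(htmlList):
        if character == '<':
            tagAppend = ''  # start a fresh tag name for this '<'
            for index, word in enumerate(htmlList):
                if index > iterate:
                    if word == '>':
                        # end of the tag: store the name and stop scanning
                        tagList.append(tagAppend)
                        break
                    tagAppend += word
    return tagList


print(getTagFixed(list('<html><head><meta>')))
```

The inner loop still restarts from the beginning of the list for every '<', so this is only a small correction to the original idea, not an efficient way to do it.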

The program seems to make sense to me. It creates an empty list (tagList), then iterates through the list (htmlList), like the first list I posted.

When iterating, if it comes across a '<', it adds all the characters after the index where it found the '<' to a string called tagAppend. It stops when it reaches a '>', which ends the tag. The tagAppend is then added to tagList. It then clears tagAppend and goes around the loop again.
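
The behaviour described above could also be sketched as a single pass over the list, with no inner loop. This is only one possible arrangement, not necessarily the intended one: the name getTagSinglePass and the use of None to mean "not currently inside a tag" are assumptions made for the illustration.

```python
def getTagSinglePass(htmlList):
    # Hypothetical single-pass variant of the logic described above.
    tagList = []
    tagAppend = None  # None means we are not inside a tag right now
    for character in htmlList:
        if character == '<':
            tagAppend = ''  # a tag starts: begin collecting its name
        elif character == '>':
            if tagAppend is not None:
                tagList.append(tagAppend)  # a tag ends: store the name
            tagAppend = None
        elif tagAppend is not None:
            tagAppend += character  # inside a tag: keep the character
    return tagList


print(getTagSinglePass(list('<html><head><meta>')))
```

Characters that fall between a '>' and the next '<' are skipped, so text outside the tags does not end up in the result.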
