Wednesday, 26 August 2020

Checking if any word in a string appears in a list using Python

I have a pandas dataframe with a column containing several thousand comments. I would like to iterate through every row of that column and check whether the comment contains any word from a list of words I've created. If it does, I want to label the row accordingly in a separate column. This is my code so far:

import re

retirement_words_list = ['match','matching','401k','retirement','retire','rsu','rrsp']

def word_checker(row):
    for sentence in df['comments']: 
        if any(word in re.findall(r'\w+', sentence.lower()) for word in retirement_words_list):
            return '401k/Retirement'
        else:
            return 'Other'

df['topic'] = df.apply(word_checker,axis=1)    

The code labels every single comment in my dataframe as 'Other', even though I have double-checked that many comments contain one or more of the words from my list. Any ideas on how I can correct my code? I'd greatly appreciate your help.
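
A likely cause, judging only from the code as posted: df.apply(word_checker, axis=1) calls the function once per row, but the function ignores its row argument, loops over the whole df['comments'] column, and returns during the first iteration. Every row therefore gets the label of the first comment, which would be 'Other' whenever that first comment has no listed word. Below is a minimal sketch of one possible fix, where the function receives a single comment and is applied to that column only. The sample dataframe and the retirement_words set are made up for illustration, and the str() call is an assumption that the real column may contain missing values.

```python
import re

import pandas as pd

retirement_words_list = ['match', 'matching', '401k', 'retirement', 'retire', 'rsu', 'rrsp']
# A set makes each membership test cheap (hypothetical helper name).
retirement_words = set(retirement_words_list)


def word_checker(comment):
    """Label a single comment based on whether it contains any listed word."""
    # str() guards against non-string values such as NaN (assumed possible).
    tokens = re.findall(r'\w+', str(comment).lower())
    if any(token in retirement_words for token in tokens):
        return '401k/Retirement'
    return 'Other'


# Hypothetical sample data standing in for the real dataframe.
df = pd.DataFrame({'comments': [
    'Please improve the 401k match.',
    'The cafeteria food is great.',
    'I plan to retire in five years.',
]})

# Apply the function to each value of the one column, not to whole rows.
df['topic'] = df['comments'].apply(word_checker)
print(df)
```

Because matching is done on whole tokens, a comment containing only 'rematch' would still be labeled 'Other'. Whether that is the desired behavior depends on the data.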
