I have a pandas dataframe that contains a column of several thousands of comments. I would like to iterate through every row in the column, check to see if the comment contains any word found in a list of words I've created, and if the comment contains a word from my list I want to label it as such in a separate column. This is what I have so far in my code:
retirement_words_list = ['match','matching','401k','retirement','retire','rsu','rrsp']
def word_checker(row):
for sentence in df['comments']:
if any(word in re.findall(r'\w+', sentence.lower()) for word in retirement_words_list):
return '401k/Retirement'
else:
return 'Other'
df['topic'] = df.apply(word_checker,axis=1)
The code is labeling every single comment in my dataframe as 'Other' even though I have double-checked that many comments contain one or several of the words from my list. Any ideas for how I may correct my code? I'd greatly appreciate your help.
Aucun commentaire:
Enregistrer un commentaire