Friday, March 20, 2020

Check whether words from a list appear in a column of my DataFrame

I have two dataframes. The first, "popular_title_words", has only one column, which contains words. The second has many features, including a "title" column. For each row of the second dataframe, I need to check whether the title contains any word from the first dataframe, and store the result in a new column as 1 (it does) or 0 (it does not).

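import pandas as pd

# Combine the title and id columns of both source frames
c = complete[['title', 'id']]
ad = advisement[['title', 'id']]
full1 = pd.concat([c, ad])
full1 = full1.reset_index()
# Normalize titles: cast to string, lowercase, strip punctuation and digits
full1['title'] = full1['title'].astype(str)
full1.dtypes
full1['title'] = full1['title'].str.lower()
full1['title'] = full1['title'].str.replace(r'[^\w\s]', '', regex=True)
full1['title'] = full1['title'].str.replace(r'\d+', '', regex=True)
num_rows, num_feature = full1.shape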

popular_title_words_path = 'D:/_препроцессинг_текстов/ru_test_freq_words.xlsx'

popular_title_words = pd.read_excel(popular_title_words_path)

I used the code above to create the datasets, and then I tried:

if popular_title_words['Words'] in full1['title']:
    full1['title_popularity'] = 1
else:
    full1['title_popularity'] = 0

but it doesn't work.

Please help.
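For reference, here is a minimal sketch of one way the check described above could be done row by row. It assumes that "contains" means a whole-word match on whitespace-separated tokens, and that the word column is named 'Words' as in the attempt above. The two small dataframes and the helper name has_popular_word are hypothetical stand-ins for illustration, not the real data.

```python
import pandas as pd

# Toy stand-ins for the real data (hypothetical values, for illustration only).
popular_title_words = pd.DataFrame({'Words': ['продам', 'новый', 'дом']})
full1 = pd.DataFrame({'title': ['продам велосипед',
                                'старый диван',
                                'новый дом у моря']})

# Build a set of lowercased words once, so each lookup is a fast membership test.
popular = set(popular_title_words['Words'].astype(str).str.lower())

def has_popular_word(title):
    """Return 1 if any whitespace-separated token of the title is a popular word, else 0."""
    return int(any(token in popular for token in title.split()))

# Evaluate the check separately for every row and store the 0/1 flag.
full1['title_popularity'] = full1['title'].apply(has_popular_word)
print(full1)
```

This sketch matches whole words only. If substring matches are wanted instead, an alternative would be Series.str.contains with a pattern built by joining the re.escape'd words with "|".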
