I have two DataFrames. The first, "popular_title_words", has a single column that contains words. The second has many features, including a "title" column. For each row of the second DataFrame I need to check whether the title contains any of the words from the first one, and store the result in a new column as 1 or 0.
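import pandas as pd
# Keep only the columns needed from both sources and stack them
c = complete[['title', 'id']]
ad = advisement[['title', 'id']]
full1 = pd.concat([c, ad])
full1 = full1.reset_index()
# Normalize titles: cast to string, lowercase, strip punctuation and digits
full1['title'] = full1['title'].astype(str)
full1.dtypes
full1['title'] = full1['title'].str.lower()
full1['title'] = full1['title'].str.replace(r'[^\w\s]', '', regex=True)
full1['title'] = full1['title'].str.replace(r'\d+', '', regex=True)
num_rows, num_feature = full1.shape
# Load the popular words (single column "Words")
popular_title_words = 'D:/_препроцессинг_текстов/ru_test_freq_words.xlsx'
popular_title_words = pd.read_excel(popular_title_words)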
This is how I built the datasets. Then I tried:
if popular_title_words['Words'] in full1['title']:
    full1['title_popularity'] = 1
else:
    full1['title_popularity'] = 0
but it doesn't work.
Any help would be appreciated.
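For reference, here is a minimal sketch of one way the per-row check could be done. It assumes "any word matches" is the intended rule and that whole-word matching is wanted. The two small DataFrames are made-up stand-ins for the real data; only the column names "Words" and "title" come from the code above.

```python
import re
import pandas as pd

# Hypothetical sample data standing in for the real files
popular_title_words = pd.DataFrame({"Words": ["apple", "red", "car"]})
full1 = pd.DataFrame(
    {"title": ["red apple pie", "blue sky", "carpet sale", "fast car"]}
)

# Clean the word list the same way the titles were cleaned (lowercase, no NaN)
words = popular_title_words["Words"].dropna().astype(str).str.lower().unique()

# One alternation pattern; \b limits matches to whole words, so "car"
# does not match inside "carpet". re.escape guards against regex metacharacters.
pattern = r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b"

# str.contains is evaluated per row and returns booleans; cast to 0/1
full1["title_popularity"] = (
    full1["title"].str.contains(pattern, regex=True).astype(int)
)

print(full1)
```

The point of the sketch is that the check is vectorized over rows with str.contains instead of a single if/else, which would assign one value to the whole column. If substring matches are acceptable, the \b anchors can be dropped.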