I'm back with another question.
I'm trying to write a loop that retrieves the tokenized values from a list, checks whether each tokenized cell value contains stop words, and appends the result to a new list.
# Importing the packages to be used
import xlrd
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
# Declaration of file path of the data and opening of workbook and worksheet
file_path = "C:/Users/L31101/Documents/Data/Copy_1.xlsx"
workbook = xlrd.open_workbook(file_path)
worksheet = workbook.sheet_by_name("ConsolidateModuleQnComment")
# Grabs the number of rows and columns of the worksheet
rowcount = worksheet.nrows
columncount = worksheet.ncols
# Prints the number of rows and columns
print("\nRow count: %d" % rowcount)
print("Column count: %d" % columncount)
# Grabbing the cell values and placing them inside an array named data_value
data_value = []
for rowindex in range(2, rowcount):
    # print("\nCurrent row number: %d" % rowindex)
    # print(worksheet.cell_value(rowindex, 6))
    data_value.append(worksheet.cell_value(rowindex, 6))
# Grabbing each value in data_value, tokenizing it, and adding the tokens to the data_tokenized list
data_tokenized = []
for valueindex in range(0, len(data_value)):
    data_tokenized.append(word_tokenize(data_value[valueindex]))
# Grabbing the tokenized values from the data_tokenized array and removing the stopwords
stop_words = set(stopwords.words("english"))
data_stopword_removed = []
for tokenizedindex in range(0, len(data_tokenized)):
    test_variable = data_tokenized[1]
    if test_variable not in stop_words:
        data_stopword_removed.append(test_variable)
print("\nNumber of records: %d" % len(data_stopword_removed))
It gives the following error message:
C:\Users\L31101\PycharmProjects\Year3\venv\Scripts\python.exe C:/Users/L31101/PycharmProjects/Year3/SentimentAnalysis.py
Row count: 5792
Column count: 7
Traceback (most recent call last):
  File "C:/Users/L31101/PycharmProjects/Year3/SentimentAnalysis.py", line 47, in <module>
    if test_variable not in stop_words:
TypeError: unhashable type: 'list'
Process finished with exit code 1
I've tried asking friends at my school, but none of them could give me an answer to this issue, so I'm looking for some help from the community :)
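For reference, here is a minimal sketch of what seems to be going on. Each entry of data_tokenized is itself a list of tokens, so `test_variable not in stop_words` asks a set whether it contains a list, and lists cannot be hashed. The loop also always reads `data_tokenized[1]` rather than the current index. The small stop-word set and the sample rows below are hypothetical stand-ins for `stopwords.words("english")` and the `word_tokenize` output, used only so the snippet runs without the spreadsheet or the NLTK data. Lower-casing each token before the check is also an assumption, not something the original code does.

```python
# Hypothetical stand-ins for stopwords.words("english") and the
# word_tokenize output, so the sketch runs without NLTK data or the workbook.
stop_words = {"the", "is", "a", "this", "was"}
data_tokenized = [
    ["this", "module", "is", "great"],
    ["the", "lecturer", "was", "helpful"],
]

# Reproduce the problem: a whole token list is tested against the set.
# Lists are unhashable, so the membership test raises TypeError.
try:
    data_tokenized[1] not in stop_words
except TypeError as error:
    message = str(error)

# Possible fix: test each token individually and keep one filtered
# list per row, so the row structure of the data is preserved.
data_stopword_removed = []
for tokens in data_tokenized:
    filtered = [token for token in tokens if token.lower() not in stop_words]
    data_stopword_removed.append(filtered)

print(message)
print(data_stopword_removed)
```

With this shape, `len(data_stopword_removed)` still equals the number of records, and each record holds only the tokens that are not stop words.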