I'm working on my first project that isn't straight out of a book, but I'm having trouble getting a function to work.
The function receives a list of strings and a BeautifulSoup object and attempts to find each word in soup.text. However, the code never finds any of the words, even when I am certain they are on the page. I have confirmed that the function receives the list correctly, and that the URL works and returns what I expect when I do something like print(urlSoup).
The relevant code:
    def find_words(words_list, urlSoup):
        for word in words_list:
            words_count = 0
            if word.casefold() in urlSoup:
                # ideally it should also count the number of times the word shows up with the 'words_count' bit,
                # but I have an impression that this also won't work how I want it to.
                words_count += 1
                print("The word " + word + " was found " + str(words_count) + " times in " + url + ".")
            else:
                print("The word '" + word + "' was not found in the URL you provided.")
Things I have tried so far to get the if branch to run (presumably it never runs because no word from the list is found in soup.text): removing the .casefold() call, changing soup.text to soup.content, and changing the if statement to something like
    if urlSoup.find_all(word):
I also changed the BeautifulSoup parser to lxml, but that didn't work either. At this point I'm stuck: despite searching Stack Overflow and the bs4 documentation, I haven't managed to crack this yet. I'm sure the solution is painfully obvious, but as a beginner I'm afraid I need a bit of help here.
I hope I have provided enough information; please feel free to ask if you need me to explain further.
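For reference, here is a minimal sketch of the behavior described above, under a few assumptions. Writing word in urlSoup on a BeautifulSoup object tests whether the string is one of the object's direct child nodes rather than searching the page text, and find_all(word) looks for tags named after the word, so neither is a substring search. The sketch therefore works on a plain string, which could come from urlSoup.get_text(). The names count_words and page_text are hypothetical, and matching whole words only is an assumption about what "finding a word" should mean here.

    import re

    def count_words(words_list, page_text):
        """Return a dict mapping each word to its number of occurrences (case-insensitive)."""
        haystack = page_text.casefold()
        counts = {}
        for word in words_list:
            # Assumption: match whole words only, so "cat" is not counted inside "category".
            pattern = r"\b" + re.escape(word.casefold()) + r"\b"
            counts[word] = len(re.findall(pattern, haystack))
        return counts

    # Hypothetical usage with the soup from the question:
    #     counts = count_words(words_list, urlSoup.get_text())
    sample_text = "Python is fun. I like python and Soup."
    for word, n in count_words(["Python", "soup", "missing"], sample_text).items():
        if n:
            print("The word '" + word + "' was found " + str(n) + " times.")
        else:
            print("The word '" + word + "' was not found.")

Counting every match per word, instead of incrementing once inside the if, is what lets the count exceed 1. If substring matches should count as well, haystack.count(word.casefold()) could replace the regular expression.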