1st Problem
I have a regex that ran fine as a single pattern, but since I added a second pattern and joined the two together I now get a TypeError about a bool. The second pattern is identical to the first apart from the variable name. I am only starting to build up my knowledge of regex and am still a beginner at identifying issues, so I can't understand why the pattern runs without error on its own but fails once the two are joined.
2nd Problem
I am trying to write an if statement that follows this logic:
Search through a document, which is a chat between two people. For each sentence, check whether a word from ListA appears in it. If it does, check whether any word from ListB also appears in the same sentence. If yes, print the sentence; if not, move on.
The problem is that my code prints the sentence even when only one of the two words is found, and prints the other word as None, like this:
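The logic described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `find_cooccurrences`, the sample sentences, and the two word lists are made up for the example, and both lists are assumed to be non-empty. It searches for each list separately instead of joining the two patterns into one.

```python
import re


def find_cooccurrences(sentences, list_a, list_b):
    """Return (word_a, word_b, sentence) for sentences with a word from each list."""
    # One alternation per list. \b stops "I" from matching inside "like".
    # Assumes both lists are non-empty; an empty list would match anywhere.
    pattern_a = re.compile(
        r"\b(" + "|".join(map(re.escape, list_a)) + r")\b", re.IGNORECASE)
    pattern_b = re.compile(
        r"\b(" + "|".join(map(re.escape, list_b)) + r")\b", re.IGNORECASE)

    hits = []
    for sentence in sentences:
        match_a = pattern_a.search(sentence)
        if match_a is None:
            continue  # no ListA word, so ListB is not checked at all
        match_b = pattern_b.search(sentence)
        if match_b is None:
            continue  # ListA word found, but no ListB word: move on
        hits.append((match_a.group(1), match_b.group(1), sentence))
    return hits


# Hypothetical data standing in for the chat and the two word lists.
sentences = ["I like to sleep in early", "I think we should go", "we should go"]
list_a = ["I", "you"]
list_b = ["think", "believe"]

hits = find_cooccurrences(sentences, list_a, list_b)
for word_a, word_b, sentence in hits:
    print("words matched %s and %s on Line: %s" % (word_a, word_b, sentence))
```

With this sample data only the second sentence contains a word from each list, so it is the only one printed.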
words matched I and None on Line: I like to sleep in early
I tried several conditions, such as:
if resultlex is not None and resultcat is not None:
if result.group(1) !=0 and result.group(2) !=0:
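A minimal reproduction of the None symptom, using two made-up words in place of one entry from each list. It suggests that when two capturing groups are joined with `|`, a single `re.search` match fills only the group of the alternative that matched, and that comparing a group with 0 does not filter out None.

```python
import re

# Hypothetical words standing in for one lword and one cword.
patterns = r"\b(" + re.escape("I") + r")\b", r"\b(" + re.escape("sleep") + r")\b"
pattern = "|".join(patterns)

result = re.search(pattern, "I like to sleep in early", re.IGNORECASE)
groups = (result.group(1), result.group(2))
print(groups)  # ('I', None): only the first alternative took part in the match

# A "!= 0" check lets None through, because None is not equal to 0.
none_passes_check = result.group(2) != 0
print(none_passes_check)  # True
```

Both words are present in the sentence, yet the second group is still None, because the search stops at the first position where either alternative matches.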
Here is my full function where these issues occur:
from collections import Counter
from SpeechActs.Categories import *
from Readfiles import *

# Speech Acts
CategoryGA = GA
CategoryPI = PersonalInfo
# ----------------------
# Word hit counters
CategoryHits = []
LexiconHits = []
# ----------------------
# unsure if used at this point
cleansedLex = []
# ----------------------
# Lists to hold lines where words have been matched
matchedCatlines = []
matchedLexlines = []
TestLine = []


def languagemodel():
    WordHit = None
    for line in cleanChat.values():
        for lword in cleanLex:
            for cword in CategoryGA:
                for section in line:
                    # searches each section to find words matching words stored in cleanLex
                    if any(lword in section and cword in section for lword in cleanLex for cword in CategoryGA):
                        WordHit = False
                        patterns = r"\b(" + re.escape(lword) + r")\b", r"\b(" + re.escape(cword) + r")\b"  # pattern to match containing Lword
                        pattern = "|".join(patterns)  # joins the above patterns into one
                        if re.search(pattern, section, re.IGNORECASE):  # Running pattern
                            # if match it displays match word with full line
                            result = re.search(pattern, section, re.IGNORECASE)
                            for lword in cleanLex, cword in CategoryGA:
                                resultlex = result.group(1)
                                resultcat = result.group(2)
                                if resultlex is not None and resultcat is not None:
                                    LexiconHits.append(resultlex)
                                    CategoryHits.append(resultcat)
                                    WordHit = True
                                    if section not in TestLine:
                                        TestLine.append(section)
                                        print("words matched %s and %s on Line: %s " % (resultlex, resultcat, section))
                    elif any(lword in section and cword in section for lword in cleanLex for cword in CategoryPI):
                        if len(lword) != 0 and len(cword) != 0:
                            if section not in TestLine:
                                TestLine.append(section)
                                # print("words matched %s and %s on Line: %s " % (lword, cword, section))


languagemodel()
Traceback Error:
Traceback (most recent call last):
  File "C:/Users/Lewis Collins/PycharmProjects/Test/main.py", line 115, in <module>
    languagemodel()
  File "C:/Users/Lewis Collins/PycharmProjects/Test/main.py", line 92, in languagemodel
    patterns = r"\b(" + re.escape(lword) + r")\b", r"\b(" + re.escape(cword) + r")\b" # pattern to match containing Lword
  File "C:\Users\Lewis Collins\AppData\Local\Programs\Python\Python35-32\lib\re.py", line 258, in escape
    for c in pattern:
TypeError: 'bool' object is not iterable
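For reference, a small sketch that seems to reproduce the same kind of error in isolation. The lists here are hypothetical stand-ins for cleanLex and CategoryGA. It assumes the cause is the line `for lword in cleanLex, cword in CategoryGA:`, which Python reads as a loop over a two-item tuple: the list itself, then the boolean result of `cword in CategoryGA`. After that loop `lword` is a bool, so a later `re.escape(lword)` fails. The exact wording of the TypeError depends on the Python version.

```python
import re

clean_lex = ["I", "you"]        # hypothetical stand-in for cleanLex
category_ga = ["hello", "hi"]   # hypothetical stand-in for CategoryGA
cword = "hello"

seen = []
# Parsed as: for lword in (clean_lex, (cword in category_ga))
for lword in clean_lex, cword in category_ga:
    seen.append(lword)

print(seen)         # [['I', 'you'], True]
print(type(lword))  # <class 'bool'>

error = None
try:
    re.escape(lword)  # lword is now True, not a string
except TypeError as exc:
    error = exc
    print("TypeError:", exc)
```

If the intent was to walk both lists in parallel, `zip(cleanLex, CategoryGA)` would be the usual form, though whether a loop is needed at that point at all is a separate question.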
All help and advice is appreciated. I've tried to be as clear as possible about what I'm trying to achieve and the problems that are occurring. If you feel I've left something out of the description, please tell me.
Thanks