Monday, 29 June 2020

How do I assign points/score based on word detection in a dataframe?

I'm new to Python and trying to learn word detection. I have a dataframe with a column of text:

sharina['transcript']
Out[25]: 
0      thank you for calling my name is Tiffany and we want to let you know this call is recorded...
1                                                Maggie 
2                                  through the time 
3      that you can find I have a question about a claim and our contact is..
4                       three to like even your box box and thank you for your help...

I have written a function that detects words in the CSV file this data comes from:

def search_multiple_strings_in_file(file_name, list_of_strings):
    """Return (matched string, line number, line) for each line of the file that contains any string from the list"""
    list_of_results = []
    # Open the file passed in as file_name in read-only mode
    with open(file_name, 'r') as read_obj:
        # Read the lines one by one, numbering them from 1
        for line_number, line in enumerate(read_obj, start=1):
            # For each line, check whether it contains any string from the list of strings
            for string_to_search in list_of_strings:
                if string_to_search in line:
                    # If a string is found in the line, append it along with the line number and the line
                    list_of_results.append((string_to_search, line_number, line.rstrip()))

    # Return list of tuples containing the matched string, the line number and the line where it was found
    return list_of_results

# search for the given strings in the file 'sharina.csv'

matched_lines = search_multiple_strings_in_file('sharina.csv', ['recorded','thank'])
 
print('Total Matched lines : ', len(matched_lines))
for elem in matched_lines:
    print('Word = ', elem[0], ' :: Line Number = ', elem[1], ' :: Line = ', elem[2])

I want to assign a score when certain words are detected in the dataframe, for example:

if the word 'recorded' has been mentioned = 7 points
if the word 'thank' has been mentioned = 5 points

The output should then give the sum of the points, which is 12 in this case. How can I do this?
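
To make the desired result concrete, here is a minimal sketch of the scoring described above. It is only one possible approach, under assumptions that are not in the original code: the names `WORD_SCORES` and `score_transcript` are hypothetical, each word is counted at most once per transcript, and matching is on whole lowercase words (so 'thanks' would not count as 'thank', unlike the substring check in the function above).

```python
import re

# Hypothetical word-to-points mapping, taken from the example above.
WORD_SCORES = {"recorded": 7, "thank": 5}


def score_transcript(text, word_scores=WORD_SCORES):
    """Sum the points of every scored word that appears at least once in text."""
    # Split the text into lowercase words so matching ignores case and punctuation.
    words = set(re.findall(r"[a-z']+", str(text).lower()))
    return sum(points for word, points in word_scores.items() if word in words)


# A few rows standing in for the 'transcript' column.
transcripts = [
    "thank you for calling my name is Tiffany and we want to let you know this call is recorded",
    "Maggie",
    "through the time",
]

scores = [score_transcript(t) for t in transcripts]
print("Score per row:", scores)
print("Total score:", sum(scores))
```

With a pandas dataframe, the same function could be applied per row with `sharina['transcript'].apply(score_transcript)`, and calling `.sum()` on the result would give the overall total.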
