Friday, October 9, 2020

if statements with multiple conditions and understanding the "or" operator

Good morning,

I have a question about the use of the "or" operator. I have 8 lists of positive and negative words for 4 dictionaries (lexicons):

list_pos_1 & list_neg_1 for lexicon 1
list_pos_2 & list_neg_2 for lexicon 2
list_pos_3 & list_neg_3 for lexicon 3
list_pos_4 & list_neg_4 for lexicon 4

I go through a list of sentences and, for each sentence, I check whether the words that appear before or after a connector (I have a list of connectors) are present in any of the lists above.

Then I apply some rules to get the polarity of the sentence, based on the number of positive and negative words found in the sentence, according to 3 cases:

first case: I take into account the positive and negative words found before the connector
second case: I take into account the positive and negative words found after the connector
third case: I take into account the positive and negative words found before and after the connector
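
The three cases can be sketched as follows, assuming the sentence is already a list of lower-cased lemmas and that the last connector in the sentence is used as the pivot. The helper name `split_on_last_connector` and the short connector set are hypothetical and only serve the illustration.

```python
# Hypothetical, shortened connector set for the illustration.
CONNECTORS = {"mais", "pourtant", "néanmoins", "cependant", "toutefois"}

def split_on_last_connector(words, connectors=CONNECTORS):
    """Return (before, after) around the last connector, or None if there is none."""
    positions = [i for i, w in enumerate(words) if w in connectors]
    if not positions:
        return None
    pivot = max(positions)  # keep only the last connector
    return words[:pivot], words[pivot + 1:]

words = ["film", "long", "mais", "acteur", "excellent"]
before, after = split_on_last_connector(words)
print(before)  # ['film', 'long']
print(after)   # ['acteur', 'excellent']
# Case 1 uses `before`, case 2 uses `after`, case 3 uses both lists.
```

Slicing on the pivot position avoids looking up each word's index again, which matters when the same word occurs more than once in a sentence.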

pos >/</= neg {connector} pos </>/= neg

For each case and for each rule I should get a list of sentences.
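
A minimal sketch of one such rule, assuming the counts of positive and negative words are already known. The function name `polarity_label` and the label strings are hypothetical. Strict comparisons are used here so that a tie can reach the third branch; with `>=` followed by `<=`, a tie is already caught by the first test and the final `else` is never reached.

```python
def polarity_label(n_pos, n_neg):
    """Return a label from the counts of positive and negative words."""
    if n_pos > n_neg:
        return "positive"
    elif n_pos < n_neg:
        return "negative"
    else:
        return "tie"  # only reached when n_pos == n_neg

print(polarity_label(3, 1))  # positive
print(polarity_label(0, 2))  # negative
print(polarity_label(1, 1))  # tie
```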

I print the results for each lexicon separately, for all the lexicons at the same time, and also for two dictionaries at the same time (lexicon 1 & 2, lexicon 3 & 4).

I observed that the answers are all different. Let me explain:

If I check lexicon 1 and lexicon 2 separately, I get different numbers of sentences for each polarity. If I combine the search over the two dictionaries, I get yet another result, and it does not correspond to the total of the results I get from lexicon 1 and lexicon 2 separately. I use the "or" operator to search in either "lexicon 1" or "lexicon 2", so I was expecting the result to be the combination of the results I get for the two dictionaries separately:

for lexicon 1: I get 44 sentences which are negative
for lexicon 2: I get 82 sentences which are negative

but if I use "lexicon 1 or lexicon 2", I get 91; I thought I would get 126 (which is the total of what I get for lexicon 1 and lexicon 2).
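
One possible reading of these numbers, sketched with invented data: each sentence is labelled once per lexicon setting, so a sentence that is negative under both lexicons is counted once in the combined run, not twice. In addition, "or" also widens the positive lists, so a sentence's positive count can grow and change its label. The word lists, sentences and the helper `is_negative` below are all hypothetical; they are not the real lexicons.

```python
# Toy data, invented for illustration only.
neg_1, pos_1 = {"bad"}, {"good"}
neg_2, pos_2 = {"bad", "awful"}, {"fine", "nice"}

sentences = [
    ["bad", "day"],           # negative under lexicon 1 and under lexicon 2
    ["awful", "day"],         # negative under lexicon 2 only
    ["bad", "fine", "nice"],  # negative under lexicon 1, not under lexicon 2
    ["good", "day"],          # never negative
    ["awful", "week"],        # negative under lexicon 2 only
]

def is_negative(words, neg, pos):
    """True when the sentence has strictly more negative than positive words."""
    n_neg = sum(1 for w in words if w in neg)
    # Mirrors an if/elif chain: a word already counted as negative is not counted as positive.
    n_pos = sum(1 for w in words if w not in neg and w in pos)
    return n_neg > n_pos

neg_with_1 = [s for s in sentences if is_negative(s, neg_1, pos_1)]
neg_with_2 = [s for s in sentences if is_negative(s, neg_2, pos_2)]
# "w in neg_1 or w in neg_2" is the same test as membership in the union neg_1 | neg_2.
neg_with_1_or_2 = [s for s in sentences if is_negative(s, neg_1 | neg_2, pos_1 | pos_2)]

print(len(neg_with_1), len(neg_with_2), len(neg_with_1_or_2))  # 2 3 3
```

In this toy run the separate counts are 2 and 3, yet the combined count is 3 rather than 5: the first sentence is shared by both lexicons, and the third sentence stops being negative once the positive words of lexicon 2 are taken into account.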


Maybe this is correct and I did not understand how "or" works. The script and the results are below.

# v is the sentence, already split into words. I check every word w in v to see whether
# it is present in the list of positive or negative words of my lexicon. I do this for
# each of the 4 lexicons and also for the 4 lexicons at the same time.
                      
v = [word.lemma_.lower().strip() for word in mytokens if word.pos_ != "PUNCT" and word.pos_ != "SPACE"]
        

        
        for i, j in enumerate(v):
            if j == 'mais' or j == 'pourtant' or j == 'néanmoins' or j == 'cependant' or j == 'toutefois' or j == 'dès' or j == 'bien':
                #print(j , i )
                liste_index_pivot.append(i)
            
                
        #print(liste_index_pivot)
        
        if len(liste_index_pivot)== 0:  
            elts_sans_w_pivot.append(k)
        
        else :
            for w in v:
                # Use the highest connector index and discard the others,
                # in case the sentence contains several connectors.
                ind_pivot = max(liste_index_pivot)

                #print(ind_pivot)
                ind = v.index(w)
                if ind < ind_pivot:  # only the words located before the connector
                    # look in the negative, then positive, lists of all 4 lexicons

                    if w in liste_neg_F  or w in liste_neg_D or w in liste_neg_A or w in liste_neg_P:
                        d_neg_av_t.append(w)
                    elif w in liste_pos_F or w in liste_pos_D or w in liste_pos_A or w in liste_pos_P:
                        d_pos_av_t.append(w)
                    else:
                        d_0_av_t.append(w)


                    if w in liste_neg_F  or w in liste_neg_D :  # look in the lists of two lexicons (F and D)

                        d_neg_av_fd.append(w)
                    elif w in liste_pos_F  or w in liste_pos_D :
                        d_pos_av_fd.append(w)
                    else:
                        d_0_av_fd.append(w)

                    if w in liste_neg_F :  # look in the lists of one particular dictionary
                        d_neg_av_f.append(w)
                    elif w in liste_pos_F:
                        d_pos_av_f.append(w)
                    else:
                        d_0_av_f.append(w)

                    if w in liste_neg_D :  # look in the lists of one particular dictionary
                        d_neg_av_d.append(w)
                    elif w in liste_pos_D :
                        d_pos_av_d.append(w)
                    else:
                        d_0_av_d.append(w)
                        

                else:
                    pass
                    
                    
        # Count the positive and negative words to apply the rules
        len_d_pos_av_t =len(d_pos_av_t)
        len_d_neg_av_t =len(d_neg_av_t)
        len_d_pos_av_fd =len(d_pos_av_fd)
        len_d_neg_av_fd =len(d_neg_av_fd)
        
        len_d_pos_av_f =len(d_pos_av_f)
        len_d_neg_av_f =len(d_neg_av_f)
        len_d_pos_av_d =len(d_pos_av_d)
        len_d_neg_av_d =len(d_neg_av_d)
        
        # Rules for each dictionary
                
        if len_d_pos_av_t >= len_d_neg_av_t : 
            if k not in Liste_M_t:
                Liste_M_t.append(k)
        elif len_d_pos_av_t <= len_d_neg_av_t  : 
            if k not in Liste_F_t:
                Liste_F_t.append(k)
        else:
            if k not in Liste_A_t:
                Liste_A_t.append(k)
                
        if len_d_pos_av_fd >= len_d_neg_av_fd : 
            if k not in Liste_M_fd:
                Liste_M_fd.append(k)
        elif len_d_pos_av_fd <= len_d_neg_av_fd  : 
            if k not in Liste_F_fd:
                Liste_F_fd.append(k)
        else:
            if k not in Liste_A_fd:
                Liste_A_fd.append(k)
        
        ########################################
                
        if len_d_pos_av_f >= len_d_neg_av_f : 
            if k not in Liste_M_f:
                Liste_M_f.append(k)
        elif len_d_pos_av_f <= len_d_neg_av_f  : 
            if k not in Liste_F_f:
                Liste_F_f.append(k)
        else:
            if k not in Liste_A_f:
                Liste_A_f.append(k)
                
        ############################################       
        
        if len_d_pos_av_d >= len_d_neg_av_d : 
            if k not in Liste_M_d:
                Liste_M_d.append(k)
        elif len_d_pos_av_d <= len_d_neg_av_d  : 
            if k not in Liste_F_d:
                Liste_F_d.append(k)
        else:
            if k not in Liste_A_d:
                Liste_A_d.append(k)
                
                

Results :

Polarity method:   before

*** Lexicon 1 &  2 *****
Liste F :  91  Liste M :  339  Liste A :  0 


*** All lexicon *****
Liste_F: 9  Liste M:  421  Liste A :  0 


*** Lexicon 1  *****
Liste_F: 44  Liste M:  386  Liste A :  0 


*** Lexicon   2 *****

Liste_F: 82  Liste M:  348  Liste A :  0 

As you can see, the results are not really similar.

So I was wondering whether I am using the "or" operator incorrectly, or whether it is the "for", "if" and "elif" statements that are misplaced.
