Monday, February 1, 2021

How to create a variable based on a specific condition?

I would like to create a variable based on a specific condition. I have a text variable, description, and I tried to write a loop over it. What I'm trying to do is:

If word1 and word2 are both in description on the first row, then append word1 and word2 to classe.

If only word3 is contained in the second row, then append word3 to classe, and so on. If none of the three words is present, then append Others.

Here's the loop I have so far:

    container = ("word1", "word2", "word3")

    classe = []

    for i in range(0, df.shape[0]):
        for n in container:
            if [df['description'].str.contains(n).any()]:
                classe.append(n)
        else:
            classe.append('Others')

And I would like something like this:

[Image: desired output]

But I got this weird result instead, which is not at all what I want:

classe

['word1',
 'word2',
 'word3',
 'Others',
 'word1',
 'word2',
 'word3',
 'Others',
 'word1',
 'word2',
 'word3',
 'Others',
 ...]

classe ends up with a length of 3040, whereas df.shape[0] is equal to 608, so something is wrong: I need classe to have the same length as the DataFrame, 608.

Any idea how to fix that? If you have another solution, I'm also interested :)

Thanks.
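
One possible direction, as a minimal sketch rather than a definitive fix. Three things in the loop above seem to explain the result: [ ... ] wraps the test in a one-element list, which is always truthy; .any() checks the whole column instead of the current row; and the else attached to the inner for runs every time that loop finishes without a break. The sketch below instead builds exactly one label per row. The helper name classify, the sample rows, and the " and " label format are assumptions for illustration, not taken from the original data.

```python
# Hypothetical keywords, the same placeholders as in the question.
container = ("word1", "word2", "word3")


def classify(description, keywords=container):
    """Return the keywords found in one description, or 'Others' if none match."""
    # Plain substring test, applied to a single row's text.
    found = [word for word in keywords if word in description]
    # The " and " separator is an assumed label format.
    return " and ".join(found) if found else "Others"


# Made-up sample rows standing in for df['description'].
descriptions = [
    "text with word1 and also word2",
    "only word3 here",
    "nothing relevant",
]

# One label per row, so the result has the same length as the input.
classe = [classify(text) for text in descriptions]
print(classe)
```

With a DataFrame, and assuming description holds only strings (no missing values), the same helper could be applied with df['classe'] = df['description'].apply(classify), which would give one value per row.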
