Thursday, 26 August 2021

Constructing a new column with multiple conditions in pandas & numpy

I have been trying for hours to apply a simple multi-condition rule in pandas, but I keep running into errors and cannot get what I want, although it feels like it should be pretty simple!

I have this df:

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'name': ['Sara', 'John', 'Christine'],
                    'country': ['US', 'UK', 'CA'],
                    'Age': [10, 20, 40]})

df1

looks like:

    name        country     Age
0   Sara             US     10
1   John             UK     20
2   Christine        CA     40

I want to apply multiple conditions: for example, if the Age is > 10 and the country is in a list of allowed countries, a result should appear in a new column.

What I did:

I tried to use np.where, but I got this error: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I also tried np.logical_and, which threw the same error, and I tried apply with a lambda, also without success.
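For context, the error quoted above is what pandas raises whenever a whole Series is used where a single True/False is expected. One common trigger is applying Python's `in` operator to a Series. The small sketch below reproduces that on a standalone Series (the variable names `raised`, `message` and `elementwise` are made up for illustration) and contrasts it with `Series.isin`, which tests membership row by row.

```python
import pandas as pd

countries = pd.Series(['US', 'UK', 'CA'])
allowed_countries = ['UK', 'India', 'Germany']

# `Series in list` makes Python compare the list items with the whole
# Series and then ask for a single bool, which pandas refuses to give.
try:
    countries in allowed_countries
    raised = False
except ValueError as exc:
    raised = True
    message = str(exc)

# Element-wise membership test: one boolean per row.
elementwise = countries.isin(allowed_countries)
print(raised, elementwise.tolist())
```

Wrapping the expression in np.logical_and or np.where does not help, because the `in` expression is evaluated first and fails before NumPy is ever called.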

allowed_countries = ['UK', 'India', 'Germany']

conditions = [
    np.logical_and(df1['Age'] >= 10, df1['country'] in allowed_countries),
    np.logical_and(df1['Age'] >= 10, df1['country'] == 'CA'),
]
choices = ['Allowed', 'Partially allowed']

df1['admission'] = np.select(conditions, choices, default=np.nan)
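One possible way to get this working, as a minimal sketch rather than a definitive answer, is to build every condition as a boolean Series: `Series.isin` replaces the `in` test and `&` combines the masks. The default value 'Not allowed' is a placeholder assumed here for illustration. Mixing string choices with a float default such as np.nan can behave differently across NumPy versions, so a string default is the safer choice.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'name': ['Sara', 'John', 'Christine'],
                    'country': ['US', 'UK', 'CA'],
                    'Age': [10, 20, 40]})
allowed_countries = ['UK', 'India', 'Germany']

# Each condition is a boolean Series with one value per row.
# .isin() does the element-wise membership test and & combines masks;
# the parentheses around the == comparison are needed because
# & binds tighter than ==.
age_ok = df1['Age'] >= 10
conditions = [
    age_ok & df1['country'].isin(allowed_countries),
    age_ok & (df1['country'] == 'CA'),
]
choices = ['Allowed', 'Partially allowed']

# 'Not allowed' is a placeholder default chosen for this sketch (assumption).
df1['admission'] = np.select(conditions, choices, default='Not allowed')
print(df1)
```

Everything here is vectorized, so it should scale to 50K rows comfortably and is typically much faster than a row-wise apply with a lambda.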

My original df has 50K rows, so I am looking for the most efficient way. Thanks
