Thursday, January 23, 2020

Error when building conditional statements: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

I have a problem accessing values in a pandas DataFrame inside a conditional statement.

import numpy as np
import pandas as pd

raw_data = {'first_name': ['Jason', 'Molly', 'Marie', 'Kerie', np.nan],
        'nationality': ['USA', 'USA', 'France', 'UK', 'UK'],
        'age': [42, 52, 36, 24, 70]}
df = pd.DataFrame(raw_data, columns=['first_name', 'nationality', 'age'])

  first_name nationality  age
0      Jason         USA   42
1      Molly         USA   52
2      Marie      France   36
3      Kerie          UK   24
4        NaN          UK   70


person_filter = ['Jason', 'Kerie','Marie']

def process_data(df):
    for pf in person_filter:
        df1 = df[['nationality', 'first_name']].drop_duplicates()
        df1 = df[df.first_name == pf][['age']]

        if df1['age'] < 30:
            df1['class'] = 'young'
        else:
            df1['class'] = 'old'

        print(df1)
        return df1

print(process_data(df))

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
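
For context, a minimal sketch of where this message comes from, using a small made-up Series (the `ages` values are illustrative, not the full data above). Comparing a Series with a number is element-wise and gives back a boolean Series, and pandas refuses to collapse a Series into a single True/False for `if`, even when it holds only one row:

```python
import pandas as pd

# Illustrative values only (two of the ages from the example data)
ages = pd.Series([42, 24], name="age")

# Element-wise comparison: the result is a boolean Series, not one bool
mask = ages < 30
print(mask.tolist())

try:
    if mask:  # Python calls bool(mask) here, which pandas rejects
        pass
except ValueError as exc:
    error_text = str(exc)
    print(error_text)

# Reducing the Series to a single value makes the condition well-defined
any_young = bool(mask.any())  # is at least one element True?
all_young = bool(mask.all())  # are all elements True?
```

Whether `.any()` or `.all()` is the right reduction depends on what the condition is meant to express, and often a row-by-row label is what is actually wanted.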

I don't understand why I cannot use the column of interest, df1['age'], in the condition!
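
One possible way around this, sketched under a few assumptions: the goal is one 'young'/'old' label per selected person, the threshold of 30 is taken from the code above, and the `names` parameter is my own addition rather than part of the original function. The column access itself works; the trouble is only that `if` needs a single boolean. Note also that the original `return` sits inside the `for` loop, so it would return after the first name. The sketch avoids the loop with `isin` and replaces the `if`/`else` with `np.where`, which picks a label per row:

```python
import numpy as np
import pandas as pd

raw_data = {'first_name': ['Jason', 'Molly', 'Marie', 'Kerie', np.nan],
            'nationality': ['USA', 'USA', 'France', 'UK', 'UK'],
            'age': [42, 52, 36, 24, 70]}
df = pd.DataFrame(raw_data, columns=['first_name', 'nationality', 'age'])

person_filter = ['Jason', 'Kerie', 'Marie']


def process_data(df, names):
    # Keep only the requested people; .copy() so the new column is added
    # to an independent frame rather than a view of df
    out = df.loc[df['first_name'].isin(names), ['first_name', 'age']].copy()
    # Vectorized if/else: 'young' where age < 30, otherwise 'old'
    out['class'] = np.where(out['age'] < 30, 'young', 'old')
    return out


result = process_data(df, person_filter)
print(result)
```

If the per-person loop has to stay, the condition could instead be reduced to a scalar first, for example with `.iloc[0]` on a one-row selection, but that assumes each name matches exactly one row.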
