I am having trouble accessing values in a pandas DataFrame inside a conditional statement.
import numpy as np
import pandas as pd

raw_data = {'first_name': ['Jason', 'Molly', 'Marie', 'Kerie', np.nan],
            'nationality': ['USA', 'USA', 'France', 'UK', 'UK'],
            'age': [42, 52, 36, 24, 70]}
df = pd.DataFrame(raw_data, columns=['first_name', 'nationality', 'age'])
  first_name nationality  age
0      Jason         USA   42
1      Molly         USA   52
2      Marie      France   36
3      Kerie          UK   24
4        NaN          UK   70
person_filter = ['Jason', 'Kerie','Marie']
def process_data(df):
    for pf in person_filter:
        df1 = df[['nationality', 'first_name']].drop_duplicates()
        df1 = df[df.first_name == pf][['age']]
        if df1['age'] < 30:
            df1['class'] = 'young'
        else:
            df1['class'] = 'old'
        print(df1)
    return df1

print(process_data(df))
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I don't know why I cannot access the column of interest by doing df1['age']!
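For what it's worth, the column access itself seems to work. A likely cause of the error is that `df1['age'] < 30` compares element-wise and returns a boolean Series, which `if` cannot reduce to a single True or False. Below is a minimal sketch of one way around it, assuming the goal is to label each selected person as 'young' or 'old'. The helper name `classify` is hypothetical, and replacing the loop with `isin` plus `np.where` is a swapped-in approach, not necessarily what the original code intended.

```python
import numpy as np
import pandas as pd

raw_data = {
    "first_name": ["Jason", "Molly", "Marie", "Kerie", np.nan],
    "nationality": ["USA", "USA", "France", "UK", "UK"],
    "age": [42, 52, 36, 24, 70],
}
df = pd.DataFrame(raw_data, columns=["first_name", "nationality", "age"])
person_filter = ["Jason", "Kerie", "Marie"]

# The comparison is element-wise: it yields one boolean per row,
# not a single bool, so it cannot be used directly in an `if`.
mask = df["age"] < 30


def classify(frame, names):
    """Hypothetical helper: keep the requested people and label each row."""
    # .copy() avoids assigning into a view of the original frame.
    out = frame[frame["first_name"].isin(names)].copy()
    # np.where picks 'young' or 'old' row by row from the boolean Series.
    out["class"] = np.where(out["age"] < 30, "young", "old")
    return out


result = classify(df, person_filter)
print(result)
```

If the per-person loop has to stay, the condition would need a scalar instead, for example by pulling a single value out of the filtered rows before comparing it to 30.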