jeudi 17 septembre 2020

Create new column conditionally from other df columns

I want to create a new boolean column in df1 if it meets both conditions in two columns of df2. For example:

df1:
   ID         Date
01234    8-23-2020
01234    8-26-2020
01235    8-24-2020
01235     9-3-2020
01236     9-1-2020

df2:
   id       visit
01234    8-23-2020
01235     9-3-2020

I want to make 'True' in df1 only the visits in df2, with a result like:

df1:
   ID         Date    In_store
01234    8-23-2020        1
01234    8-26-2020        0
01235    8-24-2020        0
01235     9-3-2020        1
01236     9-1-2020        0

I've tried:

pos_id = df2['id'].tolist()
pos_date = df2['visit'].tolist()

for row in df:
    if df1['ID'].isin(pos_id) and df1['Date'].isin(pos_visit):
        df1['In_store'] = 1
    else: 
        df1['In_store'] = 0

But I get: "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."

And I've tried:

for row in df:
    if df1['ID'] == df2['ID'] and df1['Date'] == df2['Date']:
        df1['In_store'] = 1
    else: 
        df1['In_store'] = 0

But i get: "ValueError: Can only compare identically-labeled Series objects" even after renaming the colums the same.

What am I missing? Thanks

Aucun commentaire:

Enregistrer un commentaire