Thursday, July 8, 2021

Condition for whether a variable's value is the same in different years, Python/Pandas. Fastest solution?

I have a large dataset (20 million rows). It records where each person lived in 2018 and 2019. I want to write a condition that returns True if the variable 'county' has the same value in both 2018 and 2019, and False if the two values differ. What is the most efficient way to achieve this?

import pandas as pd

df = pd.DataFrame({'id': [10, 10, 20, 20, 30, 30, 40, 40],
                   'year': [2018, 2019, 2018, 2019, 2018, 2019, 2018, 2019],
                   'county': ['1', '1', '4', '2', '3', '3', '1', '3']})

I aim to create a new column that is True for id 10 (stayer) and False for id 20 (mover).
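
For illustration, here is a minimal sketch of two ways such a column might be built on the sample data. The column names 'stayer' and 'stayer_alt' and the helper variables 'wide' and 'same' are my own, not part of the question, and the second option assumes each id has at most one row per year. I have not timed either on 20 million rows, so which one is faster would need to be measured on the real data.

```python
import pandas as pd

df = pd.DataFrame({'id': [10, 10, 20, 20, 30, 30, 40, 40],
                   'year': [2018, 2019, 2018, 2019, 2018, 2019, 2018, 2019],
                   'county': ['1', '1', '4', '2', '3', '3', '1', '3']})

# Option 1: count the distinct counties per id and broadcast the count
# back to every row; exactly one distinct value means the person stayed.
df['stayer'] = df.groupby('id')['county'].transform('nunique').eq(1)

# Option 2 (assumes at most one row per id and year): reshape to one row
# per id with one column per year, compare the two columns, then map the
# per-id result back onto the original rows.
wide = df.pivot(index='id', columns='year', values='county')
same = wide[2018].eq(wide[2019])
df['stayer_alt'] = df['id'].map(same)

print(df)
```

The two options differ when an id appears in only one of the two years: option 1 marks it True (a single distinct county), while option 2 marks it False because the missing year compares as unequal. Which behaviour is wanted depends on how such cases should be classified.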
