Below is my table (a Python dataframe). I'm trying to create the last column, shown in purple text.
Below is the logic I want to implement:
- For each unique 'cbsa' value, if the associated 'zip' field values are all the same, then set the 'age_HC01_EST_VC31_2' field equal to the 'age_HC01_EST_VC31' field (see rows highlighted in yellow).
- For each unique 'cbsa' value, if the associated 'zip' field values are all different, then set the 'age_HC01_EST_VC31_2' field equal to the sum of the 'age_HC01_EST_VC31' field values (see rows highlighted in orange).
- For each unique 'cbsa' value, if some of the associated 'zip' field values are the same and some are different, then set the 'age_HC01_EST_VC31_2' field equal to the sum of the UNIQUE 'age_HC01_EST_VC31' field values (see rows highlighted in blue).
I have tried using groupby and then sum on 'cbsa' field ... but it doesn't work for the specific, multi-layered logic I'm trying to implement.
Any help is greatly appreciated!
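For reference, the three rules seem to reduce to a single operation: count each 'zip' once per 'cbsa', sum the values, and write that total back to every row of the group. Below is a minimal sketch of that idea. It assumes the table is a pandas DataFrame, that a given 'zip' always carries the same 'age_HC01_EST_VC31' value within a 'cbsa', and that "unique" in the third rule means unique per 'zip'. The sample data and the names `per_zip` and `cbsa_totals` are made up for illustration, since the real table is not reproduced here.

```python
import pandas as pd

# Hypothetical sample data covering the three cases; not the real table.
df = pd.DataFrame({
    "cbsa": [100, 100, 200, 200, 300, 300, 300],
    "zip": ["11111", "11111", "22222", "33333", "44444", "44444", "55555"],
    "age_HC01_EST_VC31": [10, 10, 20, 30, 40, 40, 50],
})

# Keep one row per (cbsa, zip) pair so a repeated zip is counted only once.
# Assumption: a zip has a single value within a cbsa, so keeping the first row is safe.
per_zip = df.drop_duplicates(subset=["cbsa", "zip"])

# Sum the de-duplicated values within each cbsa.
cbsa_totals = per_zip.groupby("cbsa")["age_HC01_EST_VC31"].sum()

# Broadcast each cbsa total back onto every original row.
df["age_HC01_EST_VC31_2"] = df["cbsa"].map(cbsa_totals)

print(df)
```

With this sample data the new column comes out as 10, 10, 50, 50, 90, 90, 90. If "unique" should instead be taken literally as unique values regardless of 'zip', the de-duplication subset would need to be ["cbsa", "age_HC01_EST_VC31"], which gives a different total whenever two different zips share the same value.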