Wednesday, February 12, 2020

Drop all rows if there is at least one specific value in column

I am trying to write Python code that drops all observations of a given id if the column worked contains a specific value at least once for that id. Think of it as finding out which employees were never absent during the year, so that they get a bonus for showing up every single day. It makes no difference whether someone was absent 1 day or 50 days: either way, that person does not have perfect attendance for the year.

Let's say the DataFrame looks like this (df1):

  id worked
1 A  yes
2 A  no
3 B  yes
4 B  yes
5 C  no
6 C  no
7 D  yes
8 D  yes

The ideal new df should look like this (df2):

  id worked
3 B  yes
4 B  yes
7 D  yes
8 D  yes

Simply filtering with

df2 = df1[df1.worked == 'yes']

doesn't do the job: it removes id C, but it still keeps one row for id A, who was absent at least one day.

I want to make sure that if I do

df2.id.unique()

only B and D get a bonus, instead of A, B and D.

Just to be clear, I need df2 itself, not the array returned by unique(). That was only an example of how df2 might be used.
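For reference, here is a minimal sketch of one way this could be done, assuming pandas and the example data above. The frame is rebuilt by hand, and the names absent_ids, perfect and df2_alt are only illustrative, not from any existing code. The idea is to collect the ids that have at least one 'no' and then drop every row belonging to those ids; a groupby/transform variant is shown next to it for comparison.

```python
import pandas as pd

# Rebuild the example frame from the question (index 1..8 as shown above).
df1 = pd.DataFrame(
    {
        "id": ["A", "A", "B", "B", "C", "C", "D", "D"],
        "worked": ["yes", "no", "yes", "yes", "no", "no", "yes", "yes"],
    },
    index=range(1, 9),
)

# Ids that have at least one 'no' row (illustrative variable name).
absent_ids = df1.loc[df1["worked"] == "no", "id"].unique()

# Keep only the rows whose id never appears among the absent ids.
df2 = df1[~df1["id"].isin(absent_ids)]

# Alternative: per-id boolean mask that is True only if every row is 'yes'.
perfect = df1["worked"].eq("yes").groupby(df1["id"]).transform("all")
df2_alt = df1[perfect]

print(df2)
print(df2["id"].unique())
```

Both variants keep the original row index, so df2 contains rows 3, 4, 7 and 8 only. The isin version is probably easier to read; the transform version avoids building the intermediate list of ids.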
