Monday, 27 February 2017

Python pandas: apply an if statement and groupby only for one category

Piggybacking off this question: "python pandas flag if more than one unique row per value in column".

I want to apply the following rule only to rows with Type X.

    df['Test_flag'] = np.where(df.groupby('Category').Code.transform('nunique') > 1, 'T', '')

DataFrame df:

    Code      |  Type  | Category  |    Count
    code1          Y        A          89734
    code1          Y        A          239487
    code2          Z        B          298787
    code3          Z        B          87980
    code4          Y        C          098454
    code5          X        D          298787
    code6          X        D          87980

Expected result:

    Code      |  Type  | Category  |    Count  | Test_flag
    code1          Y        A          89734
    code1          Y        A          239487
    code2          Z        B          298787
    code3          Z        B          87980
    code4          Y        C          098454
    code5          X        D          298787       T
    code6          X        D          87980        T

I tried this:

    df['Test_flag'] = np.where((df['Type'] == 'X') & df.groupby('Category').Code.transform('nunique') > 1, 'T', '')

and I get the following error:

ValueError: operands could not be broadcast together with shapes (1,2199) (7620,)
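
One thing that is wrong in the attempt regardless of the data: in Python, `&` binds more tightly than `>`, so the expression is parsed as `((df['Type'] == 'X') & nunique) > 1` rather than as the conjunction of two boolean conditions. A minimal sketch of a possible fix follows, assuming the sample table above is representative. The names `is_type_x` and `has_multiple_codes` are illustrative, and `Count` is stored as strings only to preserve the leading zero in the sample. The broadcast error itself may also depend on the real data, which is not shown here, so treat this as a sketch rather than a confirmed diagnosis.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; Count kept as strings (assumption)
# so that the leading zero in "098454" is preserved.
df = pd.DataFrame({
    "Code": ["code1", "code1", "code2", "code3", "code4", "code5", "code6"],
    "Type": ["Y", "Y", "Z", "Z", "Y", "X", "X"],
    "Category": ["A", "A", "B", "B", "C", "D", "D"],
    "Count": ["89734", "239487", "298787", "87980", "098454", "298787", "87980"],
})

# Build each boolean mask separately so operator precedence cannot bite.
is_type_x = df["Type"] == "X"
has_multiple_codes = df.groupby("Category")["Code"].transform("nunique") > 1

# Flag a row only when both conditions hold.
df["Test_flag"] = np.where(is_type_x & has_multiple_codes, "T", "")

print(df)
```

On the sample data, only the two Category D rows (Type X, two distinct codes) receive `'T'`. Category B also has two distinct codes, but its rows are Type Z and so stay unflagged. The same result can be written on one line by wrapping each comparison in its own parentheses.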
