Monday, 26 October 2015

How to add conditional columns to pandas df

I want to create a column in a dataframe that is conditionally filled with values. Basically my dataframe looks like this:

  Origin     X
0 Guatemala  x
1 China      x
2 Kenya      x
3 Venezuela  x
4 Bangladesh x

What I want to do now is create an additional column 'Continent' that holds the continent for each country. My result would look like this:

  Origin     X  Continent
0 Guatemala  x  South America
1 China      x  Asia
2 Kenya      x  Africa
3 Venezuela  x  South America
4 Bangladesh x  Asia

I have tried the following code to achieve what I want:

def GetContinents(x):
    if x['Origin']== 'Thailand' or 'Indonesia' or 'China' or 'Japan' or 'Bangladesh':
        return 'Asia'
    elif x['Origin']== 'Boliva' or 'Guatemala' or 'Venezuela' or 'Mexico' or 'Argentinia':
        return 'South America'
    elif x['Origin']== 'Guinea Bissau' or 'Egypt' or 'Zaire' or 'Kenya':
        return 'Africa'
    else:
        return 'unknown'

df['Continent']= df.apply(GetContinents, axis=1)

This one fills every row of 'Continent' with 'Asia', and I don't understand why.

df['Continent'] = np.where(df['Origin'] == 'Bangladesh', 'Asia', 'unknown')

This one works in the sense that it puts 'Asia' in the right row and 'unknown' in all the others, but when I try something like df['Continent'] = np.where(df['Origin'] == 'Bangladesh' or 'China', 'Asia', 'unknown') I get an error.

So basically my question is: how can I make my condition match any one of several values?
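
For what it's worth, here is a minimal sketch of two approaches that might work. Python reads x == 'a' or 'b' as (x == 'a') or 'b', and a non-empty string is always truthy, which would explain why the first branch always wins. In the np.where attempt the same `or` is applied to a whole Series, whose truth value is ambiguous, hence the error. The sketch uses Series.isin for the membership test and Series.map with a lookup dict for the full mapping. The sample frame, the list names (asia, south_america, africa), the dict name country_to_continent and the extra column 'Continent_where' are assumptions made up for illustration; the country spellings are copied from the code above.

```python
import numpy as np
import pandas as pd

# Sample frame mirroring the one in the question (assumed for illustration).
df = pd.DataFrame({
    'Origin': ['Guatemala', 'China', 'Kenya', 'Venezuela', 'Bangladesh'],
    'X': ['x'] * 5,
})

# Country lists copied from the question, spelling kept as-is.
asia = ['Thailand', 'Indonesia', 'China', 'Japan', 'Bangladesh']
south_america = ['Boliva', 'Guatemala', 'Venezuela', 'Mexico', 'Argentinia']
africa = ['Guinea Bissau', 'Egypt', 'Zaire', 'Kenya']

# Option 1: isin() tests membership row by row, so np.where receives
# a boolean Series instead of an ambiguous "Series or string" expression.
df['Continent_where'] = np.where(df['Origin'].isin(asia), 'Asia', 'unknown')

# Option 2: build a country -> continent dict and map it over the column.
country_to_continent = {}
for continent, countries in [('Asia', asia),
                             ('South America', south_america),
                             ('Africa', africa)]:
    for country in countries:
        country_to_continent[country] = continent

# Countries missing from the dict become NaN, which fillna turns into 'unknown'.
df['Continent'] = df['Origin'].map(country_to_continent).fillna('unknown')

print(df)
```

The dict version keeps all the country data in one place, so adding a country means adding one entry rather than another branch. If the apply-based function is preferred, writing each test as x['Origin'] in [...] instead of chaining `or` should behave the same way.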
