Tuesday, 29 October 2019

Check if element is in list, then write to new column in Pandas dataframe if conditions met

I am looking at a pandas dataframe containing information on all Olympic athletes for the past 120 years (Name, Weight, Country, Sport, etc.). Available at https://www.kaggle.com/heesoo37/120-years-of-olympic-history-athletes-and-results#athlete_events.csv.

Preview of dataframe

I am attempting to write a for loop that iterates through the df rows, checks the value stored in the 'Sport' column against several lists, and then adds a column to the df holding the parent category in the same row. Code so far:

aquatic_sports = ['Swimming','Diving','Synchronized Swimming','Water Polo']
track_sports = ['Athletics','Modern Pentathlon','Triathlon','Biathlon','Cycling']
team_sports = ['Softball','Basketball','Volleyball','Beach Volleyball','Handball','Rugby','Lacrosse']
gymnastic_sports = ['Gymnastics','Rhythmic Gymnastics','Trampolining']
fitness_sports = ['Weightlifting']
combat_sports = ['Boxing','Judo','Wrestling','Taekwondo']
winter_sports = ['Short Track Speed Skating','Ski Jumping','Cross Country Skiing','Luge','Bobsleigh','Alpine Skiing','Curling','Snowboarding','Ice Hockey','Hockey','Speed Skating']

for index, row in df.iterrows():
    if df.iloc[0,11] in aquatic_sports:
        df['Sport Category'] = 'Aquatic Sport'
    elif df.iloc[0,11] in track_sports:
        df['Sport Category'] = 'Track Sport'
    elif df.iloc[0,11] in gymnastic_sports:
        df['Sport Category'] = 'Gymnastic Sport'
    elif df.iloc[0,11] in fitness_sports:
        df['Sport Category'] = 'Fitness Sport'
    elif df.iloc[0,11] in combat_sports:
        df['Sport Category'] = 'Combat Sport'
    elif df.iloc[0,11] in winter_sports:
        df['Sport Category'] = 'Winter Sport'

No errors are thrown, but unfortunately all values in the new column are the same. I am unsure how to pass the current index so that each iteration returns a unique, correct value.
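
One possible reading of the symptom: df.iloc[0,11] always reads the first row regardless of the loop variable, and df['Sport Category'] = ... assigns one value to the entire column, so every pass rewrites all rows with the same category. Below is a minimal sketch of two ways around that, not a definitive answer. The small dataframe, the abbreviated category lists, and the names categories, sport_to_category and 'Sport Category (loop)' are assumptions made for illustration; the real data would come from athlete_events.csv.

```python
import pandas as pd

# Hypothetical stand-in for the Kaggle data; only the 'Sport' column matters here.
df = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D'],
    'Sport': ['Swimming', 'Judo', 'Luge', 'Golf'],
})

# Category -> sports, abbreviated from the lists in the question.
categories = {
    'Aquatic Sport': ['Swimming', 'Diving', 'Synchronized Swimming', 'Water Polo'],
    'Combat Sport': ['Boxing', 'Judo', 'Wrestling', 'Taekwondo'],
    'Winter Sport': ['Luge', 'Bobsleigh', 'Curling'],
}

# Invert to sport -> category so each row needs a single dictionary lookup.
sport_to_category = {
    sport: category
    for category, sports in categories.items()
    for sport in sports
}

# Vectorized: map() looks up each row's own 'Sport'; unmatched sports become NaN.
df['Sport Category'] = df['Sport'].map(sport_to_category)

# Loop equivalent: read from `row`, write a single cell with df.at[index, column].
df['Sport Category (loop)'] = None
for index, row in df.iterrows():
    if row['Sport'] in sport_to_category:
        df.at[index, 'Sport Category (loop)'] = sport_to_category[row['Sport']]

print(df)
```

The map() version avoids the Python-level loop, which tends to matter on a dataframe with many rows. The loop version is closer to the original code and shows where the current index goes: row['Sport'] for reading, df.at[index, ...] for writing.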
