Tuesday, 16 June 2020

I would like to learn data analysis, but I am having trouble with conditional statements and with extracting data to plot using matplotlib

I am new to data analysis, and I have a dataset about the Olympics that I would like to analyse and graph so that I can test hypotheses and learn more about the data.

I would like to find out which age wins the most gold, silver and bronze medals, and likewise for height.

This is the code I have written. I think it works (I am not sure), but it takes about 20 minutes to run, and the output format is awkward, which makes it hard to put in a graph. I would like to know how I can cut the processing time significantly and how I can graph the result:

# count the gold, silver and bronze medals won at each height value (0 to 229)
j=0
i=0
height_gold=[0]*230
height_silver=[0]*230
height_bronze=[0]*230

while(i<271116):
    while(j<230):
        if df.iloc[i,4]==j:
            if df.iloc[i,14]=='Gold':
                height_gold[j]=height_gold[j]+1
            if df.iloc[i,14]=='Silver':
                height_silver[j]=height_silver[j]+1
            if df.iloc[i,14]=='Bronze':
                height_bronze[j]=height_bronze[j]+1
        j=j+1
        #print('new_age')
    i=i+1
    j=0
    #print('new_row')

print(height_gold)
print(height_silver)
print(height_bronze)
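
The nested loops above compare every row against every possible height, which is about 271116 × 230 `iloc` lookups. A minimal sketch of a vectorised alternative using `pd.crosstab` follows. The column names `Height` and `Medal` and the small stand-in DataFrame are assumptions for illustration, since the question only refers to columns by position (4 and 14):

```python
import pandas as pd

# Tiny stand-in for the real dataset. The column names "Height" and "Medal"
# are assumptions; substitute whatever columns 4 and 14 are called in the CSV.
df = pd.DataFrame({
    "Height": [180, 180, 175, 175, 175, None, 190],
    "Medal": ["Gold", "Silver", "Gold", "Gold", None, "Bronze", "Bronze"],
})

# Keep only rows that have both a height and a medal.
medalists = df.dropna(subset=["Height", "Medal"])

# One row per height, one column per medal type, counted in a single pass.
height_counts = pd.crosstab(medalists["Height"], medalists["Medal"])

# Put the columns in a fixed order; a medal type absent from the data becomes 0.
height_counts = height_counts.reindex(
    columns=["Gold", "Silver", "Bronze"], fill_value=0
)

print(height_counts)

# For each medal type, the height with the highest count.
print(height_counts.idxmax())
```

Because the result is a DataFrame indexed by height, it should be plottable directly with something like `height_counts.plot(kind="bar")` followed by `plt.show()` (with `matplotlib.pyplot` imported as `plt`). The same pattern would apply to age by swapping in the age column.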

I would also like to know how to find out which sport gets the most medals, which Olympic year gave out the most medals, and which country gets the most medals.
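
One possible way to answer these three questions is to count rows per category with `value_counts`. This is only a sketch: the column names `Sport`, `Year`, `Team` and `Medal` and the sample rows are assumptions, not taken from the actual file.

```python
import pandas as pd

# Small stand-in for the real data. The column names are assumptions.
df = pd.DataFrame({
    "Sport": ["Swimming", "Swimming", "Judo", "Judo", "Swimming"],
    "Year": [2012, 2016, 2016, 2016, 2016],
    "Team": ["USA", "USA", "Japan", "France", "USA"],
    "Medal": ["Gold", "Silver", "Silver", None, "Gold"],
})

# Drop rows where no medal was won.
medalists = df.dropna(subset=["Medal"])

# value_counts() returns the categories sorted from most to fewest medals.
medals_by_sport = medalists["Sport"].value_counts()
medals_by_year = medalists["Year"].value_counts()
medals_by_team = medalists["Team"].value_counts()

print("Sport with most medals:", medals_by_sport.idxmax())
print("Year with most medals:", medals_by_year.idxmax())
print("Team with most medals:", medals_by_team.idxmax())
```

If the dataset has one row per athlete per event, this counts a team medal once for every team member, so team sports would be over-represented. Dropping duplicate event/medal combinations first would be one way to correct for that.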

While I am here, I would also like to ask what else I could find out from the CSV file I am using to get data to plot. All your help is very much appreciated! Thank you.
