I am new to data analysis and have a dataset about the Olympics that I would like to analyse and plot, so that I can test some hypotheses and learn more about the data.
First, I would like to find out which age wins the most gold, silver and bronze medals, and the same for height.
This is the code I have written. I think it works (I am not sure), but it takes about 20 minutes to run, and the output format is awkward, which makes it hard to put into a graph. I would like to know how I can cut the processing time significantly and how I can graph the result:
# count the gold, silver and bronze medals for each height value
j=0
i=0
height_gold=[0]*230
height_silver=[0]*230
height_bronze=[0]*230
while(i<271116):
    while(j<230):
        if df.iloc[i,4]==j:
            if df.iloc[i,14]=='Gold':
                height_gold[j]=height_gold[j]+1
            if df.iloc[i,14]=='Silver':
                height_silver[j]=height_silver[j]+1
            if df.iloc[i,14]=='Bronze':
                height_bronze[j]=height_bronze[j]+1
        j=j+1
        #print('new_age')
    i=i+1
    j=0
    #print('new_row')
print(height_gold)
print(height_silver)
print(height_bronze)
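For comparison, the nested loop above performs roughly 271116 × 230 ≈ 62 million `iloc` lookups, which is the likely cause of the long run time. One possible faster approach is to let pandas do the counting in a single pass. The following is only a sketch on a tiny made-up DataFrame: the column names `Height` and `Medal` are assumptions about what columns 4 and 14 of the CSV are called, and the sample values are invented for illustration.

```python
import pandas as pd

# Tiny stand-in for the real data. The column names "Height" and "Medal"
# are assumptions; replace them with the names in the actual CSV.
df = pd.DataFrame({
    "Height": [180, 180, 170, 170, 170, None, 165],
    "Medal": ["Gold", "Silver", "Gold", "Gold", None, "Bronze", None],
})

# Keep only rows that have both a height and a medal.
medalists = df.dropna(subset=["Height", "Medal"])

# One row per height, one column per medal type, each cell is a count.
medal_counts = pd.crosstab(medalists["Height"], medalists["Medal"])

# Make sure all three medal columns exist, in a fixed order.
medal_counts = medal_counts.reindex(
    columns=["Gold", "Silver", "Bronze"], fill_value=0
)

print(medal_counts)

# Height that appears most often among gold medal rows.
print(medal_counts["Gold"].idxmax())
```

Swapping `Height` for an age column (assumed to be called `Age`) would give the same table per age. Because the result is a DataFrame indexed by height, calling `medal_counts.plot()` should draw one line per medal type, provided matplotlib is installed.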
I would also very much like to know how to find out which sport gets the most medals, which Olympic year gave out the most medals and which country gets the most medals.
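These three questions have the same shape: keep the rows that have a medal, then count rows per category. A minimal sketch, again on invented sample data; the column names `Sport`, `Year`, `NOC` (country code) and `Medal`, and the helper `medals_by`, are assumptions made for illustration.

```python
import pandas as pd

# Invented sample rows. The column names are assumptions about the CSV.
df = pd.DataFrame({
    "Sport": ["Swimming", "Swimming", "Judo", "Judo", "Swimming"],
    "Year": [2000, 2004, 2004, 2004, 2004],
    "NOC": ["USA", "USA", "JPN", "FRA", "AUS"],
    "Medal": ["Gold", "Silver", "Gold", None, "Bronze"],
})

# Rows without a medal do not count towards any total.
medalists = df.dropna(subset=["Medal"])


def medals_by(column):
    """Return the number of medal rows per value of `column`, largest first."""
    return medalists[column].value_counts()


for column in ["Sport", "Year", "NOC"]:
    counts = medals_by(column)
    print(column, "with the most medal rows:", counts.idxmax(), counts.max())
```

If the file has one row per athlete per event, this counts medal-winning athletes rather than medals awarded, so a team event is counted once per team member. Whether that matters depends on the question being asked.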
While I am here, I would also like to ask what else I could find out from this CSV file: the CSV file/data I am using to get data to plot a graph. All your help is very much appreciated! Thank you.