Russia is doing very well. The US and China, for all their dominance of the raw medal tables, are actually doing just as well as you’d expect.
Portugal, Spain, and Greece should all be upset at themselves, while the fourth little piggy, Italy, is doing quite alright.
What determines medal counts?
I decided to play a data game with Olympic Gold medals and ask not just “Which countries get the most medals?” but a couple of more interesting questions.
My first guess of what determines medal counts was total GDP. After all, large countries should get more medals, but economic development should also matter. Populous African countries do not get that many medals after all and small rich EU states still do.
Indeed, GDP (at market value) does correlate quite well with the weighted medal count (an artificial index where gold counts 5 points, silver 3, and bronze just 1).
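The weighted index can be written down in a few lines. This is only a minimal sketch of the scoring rule described above; the function name and the example tally are made up for illustration and are not taken from the notebook.

```python
# Points per medal type, as described in the post: gold 5, silver 3, bronze 1.
MEDAL_POINTS = {"gold": 5, "silver": 3, "bronze": 1}


def weighted_medal_count(gold, silver, bronze):
    """Return the weighted index for one country's medal tally."""
    return (MEDAL_POINTS["gold"] * gold
            + MEDAL_POINTS["silver"] * silver
            + MEDAL_POINTS["bronze"] * bronze)


# Made-up tally, for illustration only: 2 golds, 1 silver, 3 bronzes.
print(weighted_medal_count(2, 1, 3))  # 2*5 + 1*3 + 3*1 = 16
```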
Much of the fit is driven by the two left-most outliers, the US and China, but the fit explains 64% of the variance, while population explains none.
Adding a few more predictors, we can try to improve the fit, but we don’t actually do much better. I expect that as the Games progress, we’ll see the model fits become tighter as the sample size (number of medals) increases. In fact, the model is already performing better today than it was yesterday.
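For readers who want to see what "explains 64% of the variance" means mechanically, here is a small sketch of a one-variable least-squares fit and its R². It assumes a plain straight-line model; the notebook may well use a different specification (for instance, transformed variables or extra predictors). The helper names and the toy numbers are invented for illustration and are not real GDP or medal data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Toy numbers, NOT real GDP or medal data.
gdp = [1.0, 2.0, 3.0, 4.0, 5.0]
score = [2.0, 4.1, 5.9, 8.2, 9.8]
intercept, slope = fit_line(gdp, score)
print(round(r_squared(gdp, score, intercept, slope), 3))
```

An R² of 0.64 would mean that the residual sum of squares is 36% of the total sum of squares around the mean.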
Who is over/under performing?
The US and China are right on the fit above. While they have more medals than anybody else, it’s not surprising. Big and rich countries get more medals.
The more interesting question is: which are the countries that are getting more medals than their GDP would account for?
Top 10 over performers
These are the 10 countries with the highest ratio of actual total medals to predicted medals:
                delta  got  predicted     ratio
Russia       6.952551   10   3.047449  3.281433
Italy        5.407997    9   3.592003  2.505566
Australia    3.849574    7   3.150426  2.221921
Thailand     1.762069    4   2.237931  1.787366
Japan        4.071770   10   5.928230  1.686844
South Korea  1.750025    5   3.249975  1.538473
Hungary      1.021350    3   1.978650  1.516185
Kazakhstan   0.953454    3   2.046546  1.465884
Canada       0.538501    4   3.461499  1.155569
Uzbekistan   0.043668    2   1.956332  1.022322
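The numbers in the table are consistent with delta being the medals obtained minus the medals predicted, and ratio being obtained divided by predicted. A minimal sketch of that arithmetic, using Russia's row as the example (the function name is hypothetical, not the notebook's):

```python
def performance(got, predicted):
    """Return (delta, ratio) comparing actual medals to the model's prediction."""
    delta = got - predicted   # how many medals above (or below) the prediction
    ratio = got / predicted   # above 1 means overperforming, below 1 underperforming
    return delta, ratio


# Russia's row from the table above: 10 medals against 3.047449 predicted.
delta, ratio = performance(10, 3.047449)
print(round(delta, 6), round(ratio, 6))
```

Sorting by ratio rather than delta favors countries with small predictions: Japan has a larger delta than Australia but ranks lower because its predicted count is higher.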
Now, neither the US nor China is anywhere to be seen. Russia’s performance validates their state-funded sports program: the model predicts they’d get around 3 medals, and they’ve gotten 10.
Italy is similarly doing very well, which surprised me a bit. As you’ll see, all the other little piggies perform poorly.
Australia is less surprising: they’re a small country which is very much into sports.
After that, no country seems to get more than twice as many medals as their GDP would predict, although I’ll note how Japan, Thailand, and South Korea form a little East Asian cluster of overperformance.
Top 10 under performers
This brings up the reverse question: who is underperforming? Southern Europe, it seems: Spain, Portugal, and Greece are all there with 1 medal against predictions of 9, 6, and 6.
France is the country which is missing the most medals (12 predicted vs 3 obtained)! Sometimes France does behave like a Southern European country after all.
The Caucasus and Central Asia (Georgia, Azerbaijan, Uzbekistan) may show up because their wealth is mostly due to natural resources and not development per se (oil and natural gas do not win medals, while human capital development does).
I expect that these lists will change as the Games go on; maybe Spain is just not as good at the events that come early in the schedule. Expect an updated post in a week.
The whole analysis was done as a Jupyter notebook, available on GitHub. You can use mybinder to explore the data. There, you will even find several little widgets to play around with.
Data for medal counts comes from the medalbot.com API, while GDP/population data comes from the World Bank through the wbdata package.