Nerd Alert: Does Stadium Even Matter?
Earlier this year, Equiratings posed a very interesting question about scoring in three-day eventing: does stadium matter? I really wanted to know the answer to this question but the article left something to be desired, for me at least.
The article is not wrong; when looking at the winner, stadium definitely plays a statistically significant part. However, I think what most people imagine when they think of scoring is the large jumps up and down the board at the end of cross country. Maybe your horse isn't strongest at dressage, but if they can gallop the snot out of a big course you can still have a shot at a top ten finish even if your dressage is decidedly not top ten.
In that sense, stadium is not influential. A great stadium round isn't going to make up for a bad cross country or dressage. While a rail might knock you from the top spot if the scores are particularly tight, in recent memory many competitors had at least a rail in hand going into stadium (Badminton 2019 notwithstanding, which was by far THE most exciting finish to a 5* I can ever remember).
I compared Top Ten and Non-Top Ten finishes for the scores I was examining* (dataset graciously provided by Emma #nerdwithfriends) to see if it was my imagination, or if stadium did not in fact play a big part in the large jumps up and down the final board.
So, yanno. I made some graphs.
For those of you not familiar, these are box and whisker plots. They show the median, the spread, and any outliers for the scores. The dark line somewhere near the center of the box is the median. For dressage and cross country, the dark line is not the same between Top Ten and Non-Top Ten finishes. However, for the stadium scores you can see they are almost identical. A cursory analysis of variance confirms what is visible in the charts: the difference between average stadium scores in top ten and non-top ten finishes is not statistically significant.
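For anyone who wants to poke at the idea themselves, here is a minimal Python sketch of the two pieces described above: the numbers a box plot draws (quartiles plus the line in the middle) and the F statistic from a one-way analysis of variance. This is not the code behind the charts in this post. The function names are made up for illustration, and the scores are invented toy numbers, not the real dataset.

```python
from statistics import mean, median, quantiles


def box_summary(scores):
    """Return the values a box plot draws for one group of scores."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return {"q1": q1, "median": median(scores), "q3": q3, "mean": mean(scores)}


def one_way_anova_f(*groups):
    """F statistic: variance between group means over variance within groups."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# INVENTED toy numbers for illustration only (not from the real dataset).
top_ten_sj = [0, 0, 4, 4, 0, 8, 4, 0, 4, 0]      # stadium penalties
non_top_ten_sj = [4, 0, 8, 0, 4, 0, 0, 4, 8, 0]
top_ten_dr = [25.1, 27.3, 28.0, 26.5, 29.2, 27.8, 30.1, 26.0, 28.4, 27.6]
non_top_ten_dr = [33.0, 35.2, 31.8, 36.4, 34.1, 32.7, 37.0, 33.9, 35.5, 34.4]

f_stadium = one_way_anova_f(top_ten_sj, non_top_ten_sj)
f_dressage = one_way_anova_f(top_ten_dr, non_top_ten_dr)

print("stadium box (top ten):    ", box_summary(top_ten_sj))
print("stadium box (non top ten):", box_summary(non_top_ten_sj))
print(f"F for stadium:  {f_stadium:.2f}")
print(f"F for dressage: {f_dressage:.2f}")
```

An F value near or below 1 says the gap between the group averages is no bigger than the noise inside each group, while a large F says the groups really do differ. The sketch stops at the F statistic; to get an actual p-value you would hand the same groups to something like `aov` in R or `scipy.stats.f_oneway` in Python.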
What does this mean in plain English? It means that, on average, someone finishing in 45th would have about the same show jumping penalties as someone finishing in 5th. The same cannot be said for cross country and dressage.
You can see a real-life application of this in the final scores for the 2019 Kentucky Three Day:
There was very little shuffling of the leaderboard once cross country day was over. Most of those in the top ten on Sunday were already there on Saturday, with the exception of Leslie Law.
As with any analytics question, once you've proved or disproved something the question becomes "now what?" Or sometimes, "who cares?"
The 'who cares' would be: do we care? What is stadium's purpose in three day eventing? Is it an afterthought, or should it have equal weight to its two siblings? The 'now what' would depend on the answer to that question. If stadium is just meant to separate first place from second, then it is already accomplishing that. But if we want it to level the playing field even more, we'd have to change something.
Which, of course, I have some ideas about. But what do you think??
Usually last after dressage...
Still managed a top six finish (and lots of seconds!) at almost every event we went to.
In another life this blog would be called Drag Queens and Data and it would just be me analyzing things and explaining them with RuPaul gifs.
*Data background and caveats: the data was pulled from EventEntries.com. Live scoring from 25 international events in the US in 2018 was used. The size of each class was not standardized, which could affect how many are in top ten/non top ten for each class. The sample size for the long format higher level events is not credible and more data should be collected. These analyses were conducted only on finishers, and eliminations were not accounted for at this time. I'm working on a way to work in eliminations. Any time I post anything about analytics I am 150% open to feedback, ideas and collaboration. Thanks again Emma for collecting and scrubbing the numbers used in this analysis!
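On the class size caveat: one possible way to standardize, which is an assumption on my part and not something done for this analysis, is to turn each finishing place into a fraction of the field and compare a top slice (say the top quarter) instead of a fixed top ten. The function names and the 25% cutoff below are hypothetical.

```python
def placing_fraction(place, class_size):
    """Finishing place as a fraction of the field: 0.0 is the winner, 1.0 is last."""
    if class_size < 2:
        return 0.0
    return (place - 1) / (class_size - 1)


def in_top_fraction(place, class_size, fraction=0.25):
    """True if the finisher landed in the top `fraction` of their class."""
    return placing_fraction(place, class_size) <= fraction


# 10th place means very different things in a class of 12 and a class of 60.
small_class = in_top_fraction(10, 12)
big_class = in_top_fraction(10, 60)
print("10th of 12 in top quarter?", small_class)
print("10th of 60 in top quarter?", big_class)
```

With a fixed top ten, both of those riders land in the same bucket even though one finished near the bottom of a small class. A fraction-based cutoff keeps the two groups comparable across class sizes, at the cost of very small groups in tiny classes.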
Ha I love it!! Did you end up using the set I built for this?? What a PITA but so many interesting findings. One of these days I’ll publish some of my outcomes too. Eventually, right? .... if anyone else wants a shot at evaluating the data lmk and I’ll send ya the link ;)
YES I did - updated above to credit you.
I'll admit I had an unfair advantage... I ended up doing my final project in one of my grad school courses about it. All I kept thinking was "I AM CREATING SO MUCH BLOG FODDER"
Oh and I guess I also learned how to use R. Which may have been the point of the course. I'm not sure.
ME me me me me Emma! I'll shoot you an email.
Yum, MATH. (And I would 100000% read the blog Drag Queens and Data.)
I suppose, in theory, stadium's supposed to show your horse still has the energy and capability for more technical jumping after cross country and I know it *can* be the thing that eliminates a competitor. But it does seem like, at least at the international level and in these data, it doesn't add a whole lot of competition to the whole shindig. It's just sort of a pass/fail situation.
This is really interesting, thank you!
Saying it's a pass/fail is a really good point I hadn't thought of. If that's the intent, then it definitely is accomplishing that. It just seems anticlimactic to end the competition in a pass/fail. Maybe that's why they run the CCI-S backwards? It's a bit more exciting.
I vote we just get rid of it altogether, yes? Please?
no way it's my best phase!!!!!!!
Ok then how about we can tag someone else in to do stadium in our stead? OR - give people the option of taking 4 faults added to their score and NOT having to ride stadium. Throw me a bone here.
Amanda I agree with your idea. Let someone else ride my horse in stadium (HA) OR i will take the 4 faults :) This is way too much math this early for my brain and not enough coffee yet :)
michelle come back and read it later :P
I will ride y'alls stadium for you. I'll jump on that grenade. You guys have nice horses.
I wonder if it would make it more significant if it was run like show jumping? As in having time count rather than just staying in the time allowed? Or is that already a thing and I'm just ignorant (which is likely the case as I do not event)?
No it's not a thing! But it's definitely something I considered. Or scoring it in such a way that instead of it being you vs the showjumping course it would be you vs the other competitors - maybe a ranking type scoring?
I also played around with some different dressage coefficients with interesting results.
I AM SO THERE FOR DRAG QUEENS AND DATA
150000% behind Drag Queens and Data. God, I love numbers.
This got me so excited that I literally just created an entire section on my site for compiling things like this we all do - cough Emma cough Megan cough.
girl you just wait. I did my entire semester final project on this I have a 10 page report of which this was like... one.
I'm TOTALLY fine with doing away with SJ. If we could replace it with a phase where one needed a T-Rex costume, I'd be completely down.