The Enigma Of Turnovers, Part 2
By Keith Myers
A couple weeks ago, I posted the initial results of some statistical work I was doing that indicated that turnovers have no correlation to wins and losses. While the numbers were quite clear, the result just bugged me. It was so counter-intuitive, and ran so contrary to my beliefs about the game of football, that I’ve had a hard time accepting it.
Almost as soon as I published that article, I started looking at the data again. After double and triple checking my work to make sure there wasn’t some horrible mistake, it was time to look at turnovers from a different angle. So instead of looking at season-long trends, I began looking at individual games. Unfortunately, using 10 years of data means looking at thousands of games, which is why this follow-up post has taken so long to get published.
There’s a lot of information here, so to start making sense of it, let’s look at some simple histograms, starting with the turnover differential for the winning team.
As you can see, it forms a fairly symmetric, roughly normal distribution around the mean of 1.15. Interestingly, the standard deviation is a rather high 1.76. This means that while “on average” the winning team has about 1 fewer turnover than the losing team, there’s a tremendous amount of “noise” in the data and thus, anything is still possible.
Winning the turnover battle in a single game hardly guarantees victory, but it does help. And looking at that graph finally gave me a way to tease that effect out of a data set with such a high degree of variance. But before we get too far ahead of ourselves, there are 2 other histograms I’d like you to look at. The first is the number of turnovers committed by winning teams:
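For anyone who wants to reproduce this kind of summary on their own game logs, here is a minimal sketch of how the mean and standard deviation of the winner's turnover differential could be computed. The records below are made-up toy values, not my 10-year data set, and the sign convention (loser's turnovers minus winner's turnovers) is an assumption chosen so that a positive number means the winner won the turnover battle.

```python
from statistics import mean, pstdev

# Hypothetical toy records: (turnovers by winner, turnovers by loser).
# These are made-up illustrative values, NOT the real 10-year data set.
games = [(1, 2), (0, 3), (2, 1), (1, 1), (0, 1), (3, 2), (1, 4), (2, 2)]

# Differential from the winner's point of view (assumed convention):
# positive means the winner committed fewer turnovers than the loser.
differentials = [loser - winner for winner, loser in games]

avg = mean(differentials)       # center of the histogram
spread = pstdev(differentials)  # population standard deviation ("noise")

# Share of games the winner won despite losing the turnover battle.
lost_battle = sum(1 for d in differentials if d < 0) / len(differentials)

print(f"mean differential:  {avg:.2f}")
print(f"standard deviation: {spread:.2f}")
print(f"won despite losing the turnover battle: {lost_battle:.0%}")
```

When the standard deviation is larger than the mean, as it is in the real data, a sizable slice of the histogram sits at or below zero, which is exactly the “anything is still possible” effect.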
I found this to be particularly interesting. The winning team averages just over 1 turnover per game. So clearly, turning the ball over isn’t an automatic ticket to losing. Over the 10-year period, there have been over 283 games in which a team turned the ball over 3 or more times and still won the game. That’s greater than 10% of the games played.
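The count of sloppy wins is simple to pull out of a list of games. The sketch below shows one way to do it; the winner turnover totals are again hypothetical toy numbers, and the names `winner_turnovers` and `THRESHOLD` are just illustrative.

```python
# Hypothetical toy data: turnovers committed by the winning team in each game.
# Made-up values for illustration only, NOT the real 10-year data set.
winner_turnovers = [1, 0, 2, 1, 0, 3, 1, 2, 4, 0]

THRESHOLD = 3  # "3 or more turnovers and still won"

# Count the games where the winner hit the threshold, then take the share.
sloppy_wins = sum(1 for t in winner_turnovers if t >= THRESHOLD)
share = sloppy_wins / len(winner_turnovers)

print(f"{sloppy_wins} of {len(winner_turnovers)} wins ({share:.0%}) "
      f"came with {THRESHOLD}+ turnovers")
```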
Looking at the turnovers by the losing teams shows much more of what I expected:
As we already expected from the results in the first graph, the average number of turnovers by the losing teams is about one higher than that of the winning teams. There really isn’t anything all that interesting here results-wise. I only included it for completeness, and in case anyone wanted to compare the data from the winning and losing teams.
I must warn you not to make causation assertions from this data so far. While the winning team in any game does “on average” tend to win the turnover battle, that does not mean the turnovers were the reason a team won or lost. It is just as plausible that the act of losing actually causes turnovers to happen and, conversely, that the act of winning prevents them.
What I mean by this is that, late in games, the team that is behind tends to take more chances and play more aggressively than it normally would. This aggressive play leads to a higher probability of turning the ball over, and thus more turnovers. Conversely, the team in the lead tends to play very conservatively and takes very few chances that might lead to turning the ball over. This conservative play decreases the probability of turning the ball over, and thus actually leads to fewer turnovers than that team might have had if it weren’t ahead.
The idea isn’t far-fetched if you think about it. How often do we see games in which a team “seals the win” with a last-minute interception? It actually happens pretty often. QBs do tend to throw more picks when they are behind and the clock is running out.
I don’t mean to say that this is the only explanation for the data above. The data just tells us correlation, not causation.
I honestly believe that losing and turnovers are interrelated causal variables. What that means is that turning the ball over leads to being behind in games, and that being behind in games leads to more turnovers late in the game. I think this is why turnovers tend to happen in bunches. I should state that, at least at this point in the analysis, this explanation is just conjecture.
So as you can see, the results from looking at individual games aren’t all that different from looking at the full seasons, but there is clearly some new information that can be taken from the results. There’s actually a lot of information; too much for one blog post in fact, so I’m going to have to split things up a bit.
This article is already way too long, so I’ll leave the probabilistic analysis for the next installment. I will leave you with this though: the next (and final) part of this series is where you’ll find the truly interesting results. That’s right, I saved the best for last.