May 20, 2013; Seattle, WA, USA; Seattle Seahawks quarterback Russell Wilson (3) participates in organized team activities at the Virginia Mason Athletic Center. Mandatory Credit: Joe Nicholson-USA TODAY Sports

The Problem With Player Stats In The NFL, And A Project On QBs

I talk a lot about metrics and statistical analysis here in 12MR. Whether it’s my quest for a better analysis metric for evaluating defenses, or my mathematical power rankings, or just my general evaluation of players, I prefer everything to be grounded in solid data-driven analysis as well as tape study.

This idea has taken off in baseball, and that’s why I got my start as a writer covering MLB. I’m a scientist, and my mathematics and analysis background allowed me to jump right in. The problem was that I’ve always been a football guy. I didn’t belong in baseball, so here I am, trying to do my analytical thing for a sport that is still in the dark ages when it comes to developing usable metrics.

Unfortunately, a lot of football doesn’t show up on a stat sheet.

For example, most running plays end with a tackle by a LB. Too bad not all tackles are equal. A tackle made 8 yards downfield isn’t the same as one made up near the line of scrimmage. If you just look at the number of tackles, you don’t get that information, and unfortunately, that’s all most people have access to.

We call those tackles close to the line of scrimmage “stops.” Bobby Wagner had 69 of them last season, which is a lot compared to his NFL peers. That particular piece of information is out there if you know where to look, but few know its importance and even fewer bother to use it.

And what else happened to make that tackle possible? What about the DT who commanded a double team of the C and G and still wasn’t pushed back? That meant the G wasn’t able to peel off and get up to the next level and block the LB who made the tackle. Where’s that on the stat sheet? Or how about the DE who got 2 yards up field and set the edge, making it impossible for the RB to bounce the run to the outside? Where’s the stat for that?

Those things are just as important as the tackle. If they don’t happen, then the stop doesn’t happen either. Football is a team game. Everyone must do their job, and do it well, or the play breaks down.

This is why we have to be careful using stats to evaluate individual players. It is also why I often use Pro Football Focus for quantitative player analysis, and not traditional stats. PFF are the only ones who evaluate the players for doing those not-so-little things that make each play work. It’s also why NFL teams pay big money for their data.

For a data-driven guy like me, this is all very frustrating. I’d love to put together mathematical models that allow us to determine a player’s impact on a game relative to other players at other positions. I could still do that, but I’d have to use PFF’s data to do so, and I don’t think they’d be happy about it.

Unfortunately, this means I’m stuck looking at team-wide data for the most part in order to generate something meaningful.

There is one position though where we can do some statistical work and not have it be meaningless. I’m talking about QB. Currently, we have 2 measures that attempt to evaluate the position: passer rating and Total QBR.

Passer rating is archaic and fairly pointless. The different variables (touchdowns, interceptions, completions, yards, etc.) are combined in a haphazard way that does not have any connection to points scored. It also doesn’t take into account some of the things that QBs do, like scrambling and avoiding sacks.
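For reference, here is a minimal Python sketch of the NFL passer rating formula as it is commonly published. The function and argument names are illustrative choices, not anything official. It shows the four pieces (completion percentage, yards per attempt, touchdown rate, interception rate), each capped between 0 and 2.375 and then averaged with equal weight. Nothing in it accounts for rushing, scrambling, or sacks.

```python
def passer_rating(attempts, completions, yards, touchdowns, interceptions):
    """NFL passer rating from basic passing totals (commonly published form)."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")

    def clamp(value):
        # Each component is limited to the range [0, 2.375].
        return max(0.0, min(value, 2.375))

    comp_part = clamp((completions / attempts - 0.3) * 5)
    yards_part = clamp((yards / attempts - 3) * 0.25)
    td_part = clamp(touchdowns / attempts * 20)
    int_part = clamp(2.375 - interceptions / attempts * 25)

    # The four parts are simply averaged with equal weight and rescaled.
    return (comp_part + yards_part + td_part + int_part) / 6 * 100


if __name__ == "__main__":
    # Every component maxed out gives the familiar ceiling of about 158.3.
    print(round(passer_rating(10, 10, 125, 2, 0), 1))
    # Ten incompletions with no interceptions still earn about 39.6.
    print(round(passer_rating(10, 0, 0, 0, 0), 1))
```

The second call is the well-known quirk: a QB who throws every pass into the dirt still gets a rating of about 39.6, purely for not being intercepted.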

QBR, on the other hand, is quite new. ESPN’s staff put it together in an attempt to fix some of the problems that the passer rating stat has. The problem is that they took it too far. Sacks seem to be the biggest factor in determining the rating, which is a bit silly.

The system is also not purely stats based, and it is hardly objective. Plus, the fact that they refuse to divulge the formula used only adds to the problem. I simply refuse to use Total QBR because ESPN refuses to tell us what actually goes into the metric.

That leaves us with almost nothing for a true QB rating system. After a ton of prodding by a few other analysts, I’m going to aim to fix that.

Over the next year, I’m going to be working with some of the best numbers guys in the business to try and develop a QB rating metric that improves on the archaic passer rating, but doesn’t get into the stupidity that is QBR. It’s likely to be a comically frustrating voyage, so I’m going to share my progress here on 12MR.

At this point, I have no idea what this is going to look like, but I’m sure I’ll have more grey hair when it’s completed.


Tags: Seattle Seahawks

  • bobk333

    PFF grading, like the ESPN Total QB Rating, is a black box. As far as I can tell, they only give you the grade for each player for each game and don’t tell you how they came up with the grade. They do have a general explanation of the things they look for, but they don’t show you how they apply it in actual scoring. You just have to trust them. I don’t think they are very confident about their grading system because it would be easy for them to release their grades for every play and every player (as they claim they do). It would be extremely useful for the football analysis world if they released their play charting/grading and allowed discussion and debate on the grades for every play.

    It’s been tried in the past (e.g. Football Outsiders), but we still need a general *open data* play charting and grading mechanism. I believe a great system can be created using the all-22 coaches film as a basis.

    Note: has very recently started looking into the PFF grading system (Somebody *had* to do it):

    Excerpt from Ninersnation:

    “There has been some concern in different corners of the football interwebz over the usage of Pro Football Focus’ grades as an authority on player performance.

    “Before stumbling upon the controversy, I was already brewing in my head an idea to go through their grades for every game the 49ers played last year. The idea was to find some of the best and worst performances of each game, as judged by PFF, and review the most interesting ones on the All-22 and break down plays.

    “Now that there seems to be a growing interest in PFF and the reliability of its grades, this idea to study film based on their grades has turned into an opportunity to compare the two.”


    • 12thMan_Rising

      I’m actually very familiar with PFF’s system. It is a bit of a “black box” as you say, but I’ve worked with them in the past and know how their data is collected. It’s a solid model, but it isn’t without its faults.

      As for passer rating, it is a decent indicator, but it’s seriously flawed. Think of it as OPS in baseball. It tells you something, but why should OBP and SLG have equal weights? And is a double truly worth twice as much as a single, and a triple worth 3 times as much? All 3 only score 1 run if there’s a player on 2nd or 3rd. That’s what led to the creation of better measures, like wOBA.

      Passer rating is much the same. The variables are cobbled together in a haphazard manner. Someone played with the formula for a couple hours until it created a result that “looked” good. Surely we can do better.
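To make the OPS comparison concrete, here is a small Python sketch using the standard definitions of OBP and SLG. The function names and the sample stat line are illustrative assumptions, not real player data. The point is visible in the code itself: SLG hard-codes weights of 1, 2, 3 and 4 for the four hit types, and OPS then adds two ratios that do not even share a denominator.

```python
def on_base_pct(hits, walks, hbp, at_bats, sac_flies):
    """OBP: times on base divided by the plate appearances that count."""
    return (hits + walks + hbp) / (at_bats + walks + hbp + sac_flies)


def slugging_pct(singles, doubles, triples, homers, at_bats):
    """SLG: total bases per at-bat, so a double counts exactly 2x a single."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    return total_bases / at_bats


def ops(singles, doubles, triples, homers, walks, hbp, at_bats, sac_flies):
    """OPS: OBP and SLG added together with equal weight."""
    hits = singles + doubles + triples + homers
    return (on_base_pct(hits, walks, hbp, at_bats, sac_flies)
            + slugging_pct(singles, doubles, triples, homers, at_bats))


if __name__ == "__main__":
    # Made-up stat line: 20 singles, 5 doubles, 1 triple, 4 homers,
    # 10 walks, no HBP or sac flies, in 100 at-bats.
    print(round(ops(20, 5, 1, 4, 10, 0, 100, 0), 3))
```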

  • bobk333

    Good luck in coming up with yet another quarterback grading system. It’s very difficult to keep out factors like the strength of a team’s running game and the strength of a team’s defense from biasing the rating. The NFL passer rating, for example, includes TD percentage, which would be higher if your defense and special teams put you in better field position and if your running game was good at moving you down the field closer to the goal line and converting short yardage situations to give you more sets of downs. Football Outsiders tries to do it with DVOA and DYAR but they introduce a whole new set of problems.

    For the past two years, ever since ESPN’s Total QBR was released, Kerry Byrne, a writer for Cold Hard Football Facts, has been a strong advocate for the archaic, 40-plus-year-old NFL passer rating:

    Byrne: “Put most simply, you cannot be a smart football analyst and dismiss passer rating.”

    He makes a strong argument that the passer rating is (surprisingly to me) a great indicator of team success. He focuses on *team* success, so he accepts the biases of strong supporting running games and defenses, but it is surprising that a stat that is geared only towards the passing game and has only minimal aspects of defense and running is such a seemingly good indicator (I don’t think we should use the term “predictor” here) of success.

    Weird. Passer rating may or may not be a good indicator of passing success by an individual quarterback (which can be said of any ranking system devised in the past, present or future), but it is highly correlated with team success.

    The NFL has ALWAYS been dominated by teams that dominate the skies, as measured by passer rating.

    • an incredible 40 of 69* NFL champions (58 percent) since 1940 finished the year No. 1 or No. 2 in Passer Rating Differential

    • 67 of 69* champions (97 percent) since 1940 finished the year ranked in the top 10 in Passer Rating Differential.

    For a little perspective, consider that 68 of 69 champions finished in the top 10 in scoring differential. That’s right. Passer rating is nearly as effective at identifying winners as points.

  • bobk333

    For the Bobby Wagner “stops” scenario, you can use the play-by-play blurbs.

    Alfred Morris right end for 8 yards (tackle by Bobby Wagner and Chris Clemons)
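As a rough illustration of how blurbs like the one above could be turned into a "stops" count, here is a minimal Python sketch. Several things in it are assumptions: the blurb format is taken from the single example above and real play-by-play feeds vary, the 2-yard cutoff for a "stop" is hypothetical (PFF's own definition is not given here), and the second sample play is made up for the demonstration.

```python
import re
from collections import Counter

# Matches blurbs shaped like:
#   "<runner> <direction> for <N> yards (tackle by <name> and <name>)"
# Plays worded differently (e.g. "for no gain") are simply skipped.
PLAY_RE = re.compile(
    r"for (?P<yards>-?\d+) yards? \(tackle by (?P<tacklers>[^)]+)\)"
)

STOP_MAX_YARDS = 2  # hypothetical cutoff for "close to the line of scrimmage"


def count_stops(blurbs, max_yards=STOP_MAX_YARDS):
    """Credit each listed tackler with a stop when the gain is short enough."""
    stops = Counter()
    for blurb in blurbs:
        match = PLAY_RE.search(blurb)
        if match is None:
            continue
        if int(match.group("yards")) <= max_yards:
            for name in match.group("tacklers").split(" and "):
                stops[name.strip()] += 1
    return stops


if __name__ == "__main__":
    plays = [
        "Alfred Morris right end for 8 yards "
        "(tackle by Bobby Wagner and Chris Clemons)",
        # Made-up play for illustration only:
        "Alfred Morris up the middle for 1 yard (tackle by Bobby Wagner)",
    ]
    print(count_stops(plays))
```

The 8-yard run credits nobody with a stop, which is exactly the distinction a raw tackle count hides.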


    • 12thMan_Rising

      True, but it’s a lot of work to compile that for everyone in the league. That’s why I let other ppl do it for me.

      • bobk333

        Where is the compiled data available?

        • 12thMan_Rising

          I get that particular stat from Pro Football Focus. Other stats, like 4th quarter and red zone specific stats, I get elsewhere. It’s a bit of a pain to have to look across many sites for all the info I need, but I’ve gotten used to it.

  • Husky Hawk

    I too am a scientist and a bit of a math guy. Some things that I would suggest.

    1) The ability to push the drive and score. 4000 yds is meaningless if you don’t put up points.

    2) Red zone effectiveness

    3) Collapsed pocket effectiveness. Big points for positive yardage and some smaller points for throwing the ball away. Negative points for sacks. In this way, a bad offensive line is calibrated for the QB. If the QB gets sacked half the time the pocket collapses, but half the time they connect with a pressure pass or run the ball for positive yardage then the sacks get canceled out.

    4) Eliminate (true) Hail Mary interceptions/TDs. If it was a planned route then that is different than a bomb “winner takes all” grab-fest.

    5) Clock management. Does the QB run an effective up-tempo offense when behind? Does he slow the drive to a crawl when ahead?

    6) Tare for Defense. Is the QB playing against the Bears or the Titans? Take a defense rating and standardize it so the final QB rating is adjusted to the level of the defense.

    I realize all of this is probably next to impossible to implement, but as you are the numbers guy, see if any derivation of these concepts can be utilized. At the very least, think of this as you develop your rating. Good luck.
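Two of the suggestions above, the collapsed-pocket score (item 3) and the adjustment for defense (item 6), are simple enough to sketch in Python. Everything specific here is an assumption made for illustration: the outcome labels, the weights, and the idea of scaling by the league-average rating allowed are one possible reading of the list, not a worked-out method.

```python
# Hypothetical weights for plays where the pocket collapses (item 3).
PRESSURE_WEIGHTS = {
    "positive": 1.0,    # completion or run for positive yardage
    "throwaway": 0.25,  # ball thrown away, no loss
    "sack": -1.0,       # QB taken down behind the line
}


def pressure_score(outcomes):
    """Average score per collapsed-pocket play; 0.0 if there were none."""
    if not outcomes:
        return 0.0
    return sum(PRESSURE_WEIGHTS[o] for o in outcomes) / len(outcomes)


def adjust_for_defense(raw_rating, opponent_allowed, league_allowed):
    """Scale a raw rating by how stingy the opposing defense is (item 6).

    opponent_allowed: average rating that defense gives up (assumed input).
    league_allowed:   league-wide average rating allowed (assumed input).
    """
    if opponent_allowed <= 0:
        raise ValueError("opponent_allowed must be positive")
    return raw_rating * league_allowed / opponent_allowed


if __name__ == "__main__":
    # Sacked half the time, positive play the other half: nets out to zero.
    print(pressure_score(["sack", "positive", "sack", "positive"]))
    # A 90.0 against a defense that allows less than average gets a boost.
    print(adjust_for_defense(90.0, 75.0, 85.0))
```

With equal and opposite weights for sacks and positive plays, the "sacks get canceled out" case from item 3 comes out to exactly zero.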

    • 12thMan_Rising

      This is a great start. I’d add third down effectiveness to the list. The QB must be able to extend drives if they are going to score, otherwise it’s too many FGs.

      I think we should also not penalize the QB for dropped passes, or interceptions that bounce off a receiver’s hands. My concern is that the more of these adjustments that we make, the more difficult it will be to reliably provide results.

      Of course, we need to see how often these things happen, and whether they happen about equally to everyone. There’s lots of work ahead.

  • Hawkman54

    Wonderful project. Hope you can persevere and come up with a good system.
    Hopefully it can include the dreaded WR/RB interception. When a receiver gets both hands on a ball but fails to catch it, bats it up, and loses it to a defender, and the QB ends up with the interception, I say that is hogwash. They get paid to catch the ball! If two hands are on it they should catch it! It shouldn’t go against the QB.

  • skeletony

    I may be wrong about this but I could have sworn that ESPN DID lay out what went into their Total QBR formula back when it first appeared. I am remembering a special show that featured Jon Gruden and others detailing the whats and whys of the thing.
    Of course even if that is true it doesn’t mean that the Total QBR is necessarily great or whatever so I applaud your efforts.

    • 12thMan_Rising

      They did a thing that talked about the basics of what went into the formula. They explained WPA and expected points as such, but didn’t detail the math. Or if they did, I didn’t see it, and it’s not available anywhere now.

      Plus, WPA for individual players is horribly flawed. Take the stuff in my article above, like the DT and DE doing their jobs that help the LB make the tackle. WPA assumes that if you’re good at collecting stats like sacks and tackles, then you’ll be equally good at doing those other things. That’s an extremely stupid assumption to make, and it’s painfully easy to demonstrate its fallacy.

      • skeletony

        Ah, I see now. I guess that is why I don’t trust my own memories on such matters (plus even if they had delved into the mathematics I would not have paid much attention anyway).

        Sounds like you have a lot of work ahead.

    • bobk333

      The ESPN QBR is known to be a black box (to everyone outside of ESPN) which makes it unusable.

      I think Pro Football Focus grading is the same way. We cannot fully accept their grades unless we see how they are derived. Until we see the grades on each player on each play, their grades – on someone like Breno Giacomini, for example – might as well be just another personal opinion that we should take with a grain of salt.