This is a “Stats Explained” article and comes in two parts: Overview and Discussion. The Overview is for those of you who aren’t interested in the stats themselves; it tells you what the stat means and how to interpret it. The Discussion goes a little further into how I generated the stat, the mathematics I applied, what analysis I used, and so on. The comments sections on these articles will remain permanently open, and I welcome your questions.
G.A.W.P. is pretty simple. There are four well-curated, reliable, and consistent sources of NCAA football predictions and team metrics: ESPN’s FPI, Brian Fremeau’s FEI, Bill Connelly’s S&P+, and the Las Vegas lines that are released a week ahead of games. Each predicts results in a slightly different way, which means each has biases, not in the sense that the people creating them have an agenda, but in the statistical sense that they inevitably over-emphasize some traits and under-emphasize others.
I used a bunch of math to extrapolate win probabilities from each of the four metrics, and then combined all four into a “Global Win Percentage”: the percentage chance that a team will win each game on its schedule. Then I “Adjusted” it, making a few tweaks to the numbers here and there based on the biases I mentioned above. What emerges is a win percentage that is more thorough than any one metric by itself.
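To make the “Global” step concrete, here is a minimal sketch of how four per-game win probabilities might be combined and then summed over a schedule. The equal weighting is my assumption for illustration only (the article does not say how the four metrics are weighted), and the function names `global_win_pct` and `expected_wins` are hypothetical.

```python
from statistics import mean


def global_win_pct(fpi, fei, sp_plus, vegas):
    """Combine four win probabilities (each between 0 and 1) into one.

    ASSUMPTION: a plain equal-weight average. The real weighting used
    for G.A.W.P. is not stated, so treat this as a placeholder.
    """
    probs = [fpi, fei, sp_plus, vegas]
    if not all(0.0 <= p <= 1.0 for p in probs):
        raise ValueError("each probability must be between 0 and 1")
    return mean(probs)


def expected_wins(game_probs):
    """Expected number of wins over a schedule: the sum of per-game win chances."""
    return sum(game_probs)
```

For example, a game rated 0.70, 0.60, 0.65, and 0.75 by the four sources would come out at 0.675 before any bias adjustments are applied.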
This isn’t to say that FPI, FEI, S&P+, or Vegas are bad; they each do an awesome job. G.A.W.P. isn’t a competitor to those metrics, but instead an attempt to build on and develop their work. I hope that it will prove an interesting lens through which fans can view the PAC 12.
Extrapolating win percentages from FPI is trivial: they post them. S&P+ is essentially a points +/- against the average football team. Alabama is a 35, New Mexico is a -35, and everyone else falls in between. Vegas is similar, except that they don’t tell you each team’s +/-, just the difference between the two teams playing on that day. For those two metrics, I extrapolated win percentages by plotting the difference between the actual score differential and the projected score differential, using a 10,000-instance sample. From there, it was just a matter of taking a trinomial regression of the scatter plot, sussing out the average home field advantage, and voilà. FEI doesn’t display its numbers in terms of points, but as far as I can tell it’s just a matter of decimal placement: Alabama is .35, New Mexico is -.35, and everyone else is in between. So I used a very similar analysis for FEI.
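As a rough illustration of turning a projected margin into a win chance, here is a minimal sketch that skips the regression step and reads the probability straight off a sample of historical misses (actual margin minus projected margin). This is a simpler stand-in, not the regression described above. The function names, the 3-point home field default, and the FEI scale factor of 100 are all assumptions for the example.

```python
def projected_margin(team_rating, opponent_rating, home_field=3.0, at_home=True):
    """Projected point margin for a team from two points-based ratings.

    ASSUMPTION: home_field=3.0 is a placeholder, not a fitted value.
    """
    edge = home_field if at_home else -home_field
    return team_rating - opponent_rating + edge


def fei_to_points(fei_rating, scale=100.0):
    """Rescale an FEI-style decimal rating to points.

    ASSUMPTION: the scale of 100 follows the 'decimal placement' guess
    in the text (.35 becomes 35); it is not a documented conversion.
    """
    return fei_rating * scale


def win_prob(margin, residuals):
    """Estimate a win chance from a projected margin.

    residuals: historical (actual margin - projected margin) values.
    The team wins whenever projected margin + residual is positive;
    an exact zero is counted as half a win.
    """
    if not residuals:
        raise ValueError("need at least one historical residual")
    wins = sum(1 for r in residuals if margin + r > 0)
    ties = sum(1 for r in residuals if margin + r == 0)
    return (wins + 0.5 * ties) / len(residuals)
```

With a large enough residual sample, calling `win_prob(projected_margin(35, -35), residuals)` would return something close to 1, while two evenly rated teams on a neutral field would land near 0.5.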
When it comes to biases, I checked for a host of different things: conference bias, ‘brand’ bias, defensive bias, offensive bias, a bias against teams that thrive off havoc plays, and quite a few others. I created a master list of FBS vs. FBS football games from 2012-2016, including a percent chance to win for each game. I then compared population subsets (like PAC 12, SEC, or Brand Teams) to the global population averages, or developed simple metrics to measure teams in those areas, and ran correlations on that data. When I found a bias, I experimented with adjustments to maximally reduce the correlation coefficient.
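A subset-versus-population comparison like the one described can be sketched as follows. This is a simplified illustration rather than the actual analysis: it measures how much a group of games over- or under-performs its predicted win chances relative to all games. The record layout (`won`, `p_win`, `conf`) and the function names are hypothetical.

```python
from statistics import mean


def mean_error(games):
    """Average of (actual result - predicted win chance) over a set of games.

    Each game is a dict with 'won' (1 or 0) and 'p_win' (0 to 1).
    A positive value means these teams won more often than predicted.
    """
    return mean(g["won"] - g["p_win"] for g in games)


def subset_bias(games, in_subset):
    """Gap between a subset's mean error and the whole population's.

    in_subset: a function that returns True for games in the group
    of interest (a conference, a list of 'brand' teams, and so on).
    """
    subset = [g for g in games if in_subset(g)]
    if not subset:
        raise ValueError("no games matched the subset filter")
    return mean_error(subset) - mean_error(games)
```

A call such as `subset_bias(games, lambda g: g["conf"] == "PAC 12")` returning a clearly positive number would suggest the predictions were underrating that conference over the sample, which is the kind of signal an adjustment would then try to shrink toward zero.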
Finally, I added a coaching adjustment. I measure the actual performance of the team over up to four years (depending on the tenure of the current head coach at the school) against what G.A.W.P. would have predicted, average the difference, and apply it as part of the adjustment.
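Here is one way the coaching adjustment could be sketched, assuming performance is measured in wins per season. That unit, the function name `coaching_adjustment`, and the tuple layout are assumptions for illustration; the article does not specify them.

```python
def coaching_adjustment(seasons, tenure_years, max_years=4):
    """Average over- or under-performance under the current head coach.

    seasons: list of (actual_wins, predicted_wins) tuples, oldest first.
    tenure_years: how many of the most recent seasons the current coach
    has been at the school.
    Only the most recent min(tenure_years, max_years) seasons are used.
    ASSUMPTION: performance is compared in season win totals.
    """
    n = min(tenure_years, max_years, len(seasons))
    if n <= 0:
        return 0.0  # a brand-new coach gets no adjustment
    recent = seasons[-n:]
    return sum(actual - predicted for actual, predicted in recent) / n
```

A coach in his third year is judged on three seasons, while one in his tenth year is judged only on the most recent four, matching the “up to four years” rule.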
Everything I’ve done is going to be subjected to rigorous testing and analysis at the end of each season, and the model will hopefully improve substantially. I’ll keep everyone updated with how that goes.
Let’s play some football!