BD Rankings Methodology

 

Each set of rankings is obtained from an ordinary least squares (OLS) regression of a binary indicator of whether the home team won the game on a vector of team indicators, a neutral site indicator and a constant term.  For each game, the indicator variable for team X equals 1 if X is the home team, –1 if X is the visiting team and 0 if team X is not playing in the game.  More recent games receive more weight: the weights increase proportionally from 1 during the first week to 1.5 during the final regular season week (football) or conference tournament week (basketball).  The estimated coefficient for each team indicator gives an estimate of team strength in terms of the probability of winning relative to the benchmark team, whose indicator is excluded from the regression.  The constant term provides an estimate of the probability that the home team wins a game in which the competing teams are of equal strength.  The posted ratings are simply the estimated team indicator coefficients, normalized so that the worst team has a rating of zero.  The home field/court advantage is obtained by subtracting 0.5 from the estimated constant term.
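The regression described above can be sketched in a few lines of Python with NumPy.  This is only an illustration under stated assumptions, not the code used to produce the posted ratings: the six-game schedule and team names are made up, the weights are assumed to rise linearly from 1 to 1.5 across weeks, and the benchmark team is arbitrarily taken to be the last one alphabetically.

```python
import numpy as np

# Hypothetical toy schedule: (week, home, visitor, home_won, neutral_site).
games = [
    (1, "A", "B", 1, 0),
    (1, "C", "D", 1, 0),
    (2, "B", "C", 0, 0),
    (2, "D", "A", 0, 0),
    (3, "A", "C", 1, 1),
    (3, "B", "D", 1, 0),
]
teams = sorted({t for g in games for t in g[1:3]})
benchmark = teams[-1]                       # indicator excluded from the regression
cols = [t for t in teams if t != benchmark]

last_week = max(g[0] for g in games)
X, y, w = [], [], []
for week, home, visitor, home_won, neutral in games:
    # Columns: constant, neutral site indicator, then one column per team.
    row = [1.0, float(neutral)] + [0.0] * len(cols)
    if home in cols:
        row[2 + cols.index(home)] = 1.0     # +1 for the home team
    if visitor in cols:
        row[2 + cols.index(visitor)] = -1.0 # -1 for the visiting team
    X.append(row)
    y.append(float(home_won))
    # Assumption: linear ramp from 1.0 (first week) to 1.5 (final week).
    w.append(1.0 + 0.5 * (week - 1) / (last_week - 1))

X, y, w = np.array(X), np.array(y), np.array(w)

# Weighted least squares: scale each row by sqrt(weight), then solve OLS.
sw = np.sqrt(w)
beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)

# Team coefficients, with the benchmark team fixed at zero.
coef = dict(zip(cols, beta[2:]))
coef[benchmark] = 0.0

# Posted-style ratings: shift so the worst team is rated zero.
worst = min(coef.values())
ratings = {team: c - worst for team, c in coef.items()}

# Home advantage: estimated constant minus 0.5.
home_advantage = beta[0] - 0.5

for team, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {rating:.3f}")
print(f"home advantage: {home_advantage:.3f}")
```

Dropping one team's indicator is needed because the team columns sum to zero in every game (+1 for the home team, –1 for the visitor), so the full set would be collinear.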

 

The reason our rankings ignore information on margin of victory is the oft-cited one that the goal of the contestants is merely to win the game rather than to win by a certain number of points (see Ohio State football, 2002).  It is also possible that, as the BCS suspects, ranking systems that incorporate margins of victory motivate a stronger team to run up the score on a weaker team in order to improve its ranking.  [Not that any teams are paying attention to our rankings, but in principle a ranking system should minimize the extent to which such incentives exist.]  For prediction purposes, though, systems that incorporate margins of victory are superior to those that do not, since they use additional relevant information.  Thus, our rankings are more useful in providing a measure of the quality of teams up to that point than in predicting future outcomes.  As a result, the regressions that generate our predictions (posted weekly for football but only for the NCAA tournament for basketball) use OLS with home team margin of victory, rather than the win-loss indicator, as the dependent variable.  Both the rankings and predictions regressions weight more recent games more heavily because the resulting predictions (for college bowl games over the past decade) are somewhat more accurate than those obtained when all games are weighted equally.
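A predictions regression of this kind might look like the following Python sketch.  Again, this is an illustration under assumptions rather than the actual code: the scores and team names are invented, the helper names design_row and predict_margin are hypothetical, and the weights are assumed to rise linearly from 1 to 1.5.

```python
import numpy as np

# Hypothetical toy results:
# (week, home, visitor, home_points, visitor_points, neutral_site).
games = [
    (1, "A", "B", 28, 14, 0),
    (1, "C", "D", 21, 20, 0),
    (2, "B", "C", 10, 17, 0),
    (2, "D", "A", 7, 35, 0),
    (3, "A", "C", 24, 21, 1),
    (3, "B", "D", 31, 13, 0),
]
teams = sorted({t for g in games for t in g[1:3]})
cols = teams[:-1]          # the last team is the excluded benchmark

def design_row(home, visitor, neutral):
    """Constant, neutral site indicator, then +1/-1 team indicators."""
    row = np.zeros(2 + len(cols))
    row[0] = 1.0
    row[1] = neutral
    if home in cols:
        row[2 + cols.index(home)] = 1.0
    if visitor in cols:
        row[2 + cols.index(visitor)] = -1.0
    return row

last_week = max(g[0] for g in games)
X = np.array([design_row(h, v, n) for _, h, v, _, _, n in games])
# Dependent variable: home team margin of victory (negative if the home team lost).
margin = np.array([hp - vp for _, _, _, hp, vp, _ in games], dtype=float)
# Assumption: linear ramp from 1.0 (first week) to 1.5 (final week).
w = np.array([1.0 + 0.5 * (g[0] - 1) / (last_week - 1) for g in games])

# Weighted least squares via row scaling by sqrt(weight).
sw = np.sqrt(w)
beta, *_ = np.linalg.lstsq(X * sw[:, None], margin * sw, rcond=None)

def predict_margin(home, visitor, neutral=0):
    """Predicted home team margin of victory for a future game."""
    return float(design_row(home, visitor, neutral) @ beta)

print(f"A hosting B: {predict_margin('A', 'B'):+.1f}")
print(f"B hosting A: {predict_margin('B', 'A'):+.1f}")
```

The only change from the win-loss regression is the dependent variable; a positive prediction favors the home team, and swapping home and visitor flips the team difference while keeping the constant.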

 

OLS has two big advantages over the maximum likelihood methods that many other ranking systems use.  The first is that it is extremely simple to implement and explain.  The second is that rankings (i.e., team indicator coefficients) are defined even for undefeated and winless teams.  This latter issue is irrelevant for late-season NFL and basketball rankings, but poses a major problem for early-season NFL and basketball and especially for NCAA football rankings.  A disadvantage of OLS is that, as is clear from the basketball ratings, predicted probabilities of winning a game can exceed one.  However, this problem does not affect the order in which teams are ranked or the relative magnitudes by which stronger teams are favored over weaker teams.
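The undefeated-team point can be illustrated with a deliberately tiny Python example.  It assumes a made-up two-game season in which team A beats the benchmark team both at home and on the road, and it uses a plain logistic regression fitted by gradient ascent as a stand-in for maximum likelihood methods in general, not for any particular ranking system.

```python
import numpy as np

# Hypothetical two-game season: A beats the benchmark team B home and away.
# Columns: constant, indicator for A (+1 when home, -1 when visiting).
X = np.array([[1.0, 1.0],
              [1.0, -1.0]])
y = np.array([1.0, 0.0])    # 1 if the home team won

# OLS gives finite coefficients even though A is undefeated.
ols, *_ = np.linalg.lstsq(X, y, rcond=None)

def logit_gradient_ascent(steps, rate=0.5):
    """Plain gradient ascent on the logistic log-likelihood."""
    beta = np.zeros(2)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        beta += rate * X.T @ (y - p)
    return beta

short_run = logit_gradient_ascent(100)
long_run = logit_gradient_ascent(10_000)

print("OLS coefficient for A:      ", ols[1])
print("logit after 100 steps:      ", short_run[1])
print("logit after 10,000 steps:   ", long_run[1])
```

The OLS coefficient for A settles at a finite value, while the logistic coefficient keeps growing the longer the optimizer runs, because the likelihood has no finite maximum when a team's results are perfectly separated.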

 

Frankly, we were quite surprised to find that, based on the uniqueness of our football rankings on Kenneth Massey’s comparison page (www.masseyratings.com/cf/compare.htm), no other ranking system uses OLS on win/loss outcomes.  One would think that, as the most basic way to evaluate the effect of an independent variable (team identity) on an outcome (winning a game), this would be the first method someone would try.  And although these results are not yet posted here, our method predicts college bowl game outcomes as well as or better than the BCS systems over the relevant time period.  This is not meant to knock other systems, many of which are far more sophisticated than ours, but it does seem odd that such a simple method, which (to our knowledge) has not been proven to be faulty for this purpose, has not been used previously.  The only arbitrary feature of our method is the weighting scheme, but besides the justification stated above, it makes sense that, all else equal, teams playing better recently should be rated higher (a concept that is embedded in both the polls and the “last 10 games” emphasis of the NCAA basketball tournament selection committee).  Plus, it turns out not to matter much, within a wide range of weights, exactly what weights are used as long as they increase throughout the season.

 

Feel free to e-mail questions or comments to me at jdesimon@coba.usf.edu.