Ranking college football teams with matrix algebra and hill climbing

Dozens of organizations and individuals produce computer-based college football rankings. Kenneth Massey’s page displays rankings for many of these models, in addition to traditional poll-based rankings. The approaches taken in the computer models vary widely in how they address the following issues: (1) margin of victory; (2) strength of schedule; (3) game location; (4) game timing in the season; and (5) adjustments for statistics other than game score, such as points per drive, yards per game, yards per play, turnovers per game, and success rate per play in gaining the desired yardage. Most of these models are intended to be predictive, but some are intended only to rank teams. This page discusses some of the most prominent models.

We offer here another model that is intended merely to rank teams at or near the end of the season. We are not aware of any existing model that takes our approach, but we confess that we have not examined the methodologies for most computer models. Like some other models, our approach is based in matrix algebra. If there are six teams in a conference and they all play each other, the end-of-season matrix may look something like this, where a 1 indicates a win by the row team over the column team.

 

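|        | Team A | Team B | Team C | Team D | Team E | Team F |
|--------|--------|--------|--------|--------|--------|--------|
| Team A | 0      | 1      | 1      | 0      | 1      | 1      |
| Team B | 0      | 0      | 1      | 1      | 1      | 0      |
| Team C | 0      | 0      | 0      | 1      | 1      | 1      |
| Team D | 1      | 0      | 0      | 0      | 1      | 1      |
| Team E | 0      | 0      | 0      | 0      | 0      | 1      |
| Team F | 0      | 1      | 0      | 0      | 0      | 0      |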

So, Team A would finish first in the conference with four wins, but Teams B, C, and D would all be tied with three wins. If the conference had a championship game, that tie would need to be broken. Conferences have a long list of tiebreakers for these situations. In the ACC this season, five teams tied for second, and Duke was selected for the championship game because its conference opponents had the highest winning percentage, which was the fifth tiebreaker. Notably, the sixth tiebreaker would have been a computer-based ranking, and the seventh and last tiebreaker would have been a random draw.

A perfectly sorted six-team conference would have the following matrix, where all the 1’s are above the diagonal.

 

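|        | Team A | Team B | Team C | Team D | Team E | Team F |
|--------|--------|--------|--------|--------|--------|--------|
| Team A | 0      | 1      | 1      | 1      | 1      | 1      |
| Team B | 0      | 0      | 1      | 1      | 1      | 1      |
| Team C | 0      | 0      | 0      | 1      | 1      | 1      |
| Team D | 0      | 0      | 0      | 0      | 1      | 1      |
| Team E | 0      | 0      | 0      | 0      | 0      | 1      |
| Team F | 0      | 0      | 0      | 0      | 0      | 0      |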

If we arranged the teams in any other order, some 1’s would unnecessarily fall below the diagonal. We can calculate the accuracy of any arrangement as follows:

  1. Add each number above the diagonal weighted by its relative distance above the diagonal, and
  2. Subtract each number below the diagonal weighted by its relative distance below the diagonal.

By relative distance, we mean the distance from the diagonal divided by the number of teams. So, in the six-team table above, the top right cell has a weight of 5/6, as does the bottom left cell. This approach implicitly penalizes bad losses.

The table above has a calculated value of 5.83 (35/6). If we inappropriately move Team B to the bottom of the table, the calculated value falls to 2.50 (15/6): Team B’s four wins over Teams C, D, E, and F then sit below the diagonal and are subtracted rather than added. The matrix with the highest calculated value is the best arrangement.
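The calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code behind the results reported here; the function name `arrangement_score` and the list-of-lists layout for the win matrix are assumptions made for the example.

```python
def arrangement_score(wins, order):
    """Score one ordering of the teams.

    wins[i][j] is 1 if team i beat team j and 0 otherwise.
    order lists team indices from the top row of the table to the bottom.
    """
    n = len(order)
    total = 0.0
    for r, winner in enumerate(order):
        for c, loser in enumerate(order):
            # (c - r) / n is positive above the diagonal and negative below it,
            # so wins above the diagonal are added and wins below are subtracted.
            total += wins[winner][loser] * (c - r) / n
    return total


# Perfectly sorted six-team conference: each team beat every team listed below it.
perfect = [[1 if j > i else 0 for j in range(6)] for i in range(6)]

print(round(arrangement_score(perfect, [0, 1, 2, 3, 4, 5]), 2))  # 5.83
```

Passing a different `order` for the same `wins` matrix scores a rearranged table without rebuilding the matrix. For this perfectly sorted example, every other ordering scores lower.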

In our first example table above, our approach would not help to break the tie for second place. Because all the teams played each other, any ordering of tied teams would produce the same calculated value. Our approach is useful only for the more typical college football situation, where not all teams play each other. However, if we used the square root of the relative distance from the diagonal (penalizing bad losses even more) instead of the linear relative distance, our approach could break the ties in the first example table. It would rank Team B second, because of its head-to-head victories over Teams C and D. With that arrangement, there are only two numbers below the diagonal; there would be more numbers below the diagonal if we ranked Team C or Team D second.
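The tie-breaking behavior can be checked by brute force on the first example table. The sketch below keeps Team A first and Teams E and F last, tries all six orderings of Teams B, C, and D, and scores each with both the linear and the square-root weighting. The helper name `score` and the variable names are assumptions made for this illustration.

```python
from itertools import permutations
from math import sqrt

teams = "ABCDEF"
# wins[i][j] == 1 means the row team beat the column team (first example table).
wins = [
    [0, 1, 1, 0, 1, 1],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
    [1, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 0],
]


def score(order, weight):
    """Add weighted wins above the diagonal, subtract weighted wins below it."""
    n = len(order)
    total = 0.0
    for r, winner in enumerate(order):
        for c, loser in enumerate(order):
            if wins[winner][loser]:
                w = weight(abs(c - r) / n)
                total += w if c > r else -w
    return total


results = {}
for middle in permutations([1, 2, 3]):  # every ordering of Teams B, C, D
    order = [0, *middle, 4, 5]
    label = "".join(teams[t] for t in order)
    results[label] = (score(order, lambda d: d), score(order, sqrt))

for label, (linear, root) in results.items():
    print(label, round(linear, 2), round(root, 2))

best = max(results, key=lambda k: results[k][1])
print("best with square-root weights:", best)
```

The linear column is identical for all six orderings, while the square-root column is highest for the ordering A, B, C, D, E, F. The sketch compares only the orderings of the three tied teams; it does not search all 720 possible arrangements.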

As an example, we’ll consider the Big Ten Conference this year. Using the basic model described above, which considers only wins and losses, with the championship game included and the linear relative distance from the diagonal, the following matrix has the highest calculated value.

We arrive at the optimal matrix using a hill-climbing algorithm. Starting with a reasonable guess at the best arrangement based on the conference standings, the algorithm iteratively swaps nearby rows and columns searching for a local maximum.
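A minimal sketch of this kind of search is shown below. It is not the code used for the results here: the function names, the `max_gap` parameter (how far apart two swapped teams may be), and the stopping rule are all assumptions made for illustration. Swapping two entries of the ordering corresponds to swapping the matching rows and columns of the matrix.

```python
def score(wins, order):
    """Linear-weight score of an ordering (wins[i][j] == 1 if team i beat team j)."""
    n = len(order)
    return sum(
        wins[w][l] * (c - r) / n
        for r, w in enumerate(order)
        for c, l in enumerate(order)
    )


def hill_climb(wins, start, max_gap=2):
    """Swap teams at most max_gap places apart until no swap raises the score."""
    order = list(start)
    best = score(wins, order)
    improved = True
    while improved:
        improved = False
        for i in range(len(order)):
            for j in range(i + 1, min(i + max_gap, len(order) - 1) + 1):
                order[i], order[j] = order[j], order[i]
                candidate = score(wins, order)
                if candidate > best + 1e-12:
                    best = candidate          # keep the improving swap
                    improved = True
                else:
                    order[i], order[j] = order[j], order[i]  # undo the swap
    return order, best


# Toy check: a perfectly sorted six-team conference, started from a scrambled guess.
perfect = [[1 if j > i else 0 for j in range(6)] for i in range(6)]
order, value = hill_climb(perfect, [1, 0, 2, 4, 3, 5])
print(order, round(value, 2))  # [0, 1, 2, 3, 4, 5] 5.83
```

Because a swap is kept only when it strictly raises the score, the loop always terminates, but it may stop at a local maximum rather than the best possible arrangement. Starting from the conference standings, as described above, makes that less likely.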

The basic model described above considers only wins and losses and doesn’t consider other issues discussed above, such as margin of victory, strength of schedule, game location, or timing of the game within the season. We can easily adapt the model to consider most of these issues. It can consider margin of victory by using that margin (or some other value representing that margin) instead of a 1 in the table. It can consider game location by awarding more points for road wins and fewer points for home wins. And it can consider game timing within the season by awarding more points for games later in the season.

We ran a second model that adjusted for both margin of victory and game location. Rather than using the actual margin, we used codes designed so a 1-point win was worth half as much as a win by 17 points or more, with the idea being that beating a team once by more than two scores should be valued the same as beating the team twice by 1 point. And we used an average bonus of 3 points for road victories, which is thought to be the average home field advantage. So, a road win by 1 point is worth as much as a home win by 4 points. Finally, we ran a third model that also adjusted for timing of the game during the season. We used a 5 percent weekly inflation factor, so a win in week 2 was worth 1.05 times as much as a win in week 1, a win in week 3 was worth 1.1 times as much as a win in week 1, etc. The following table shows the Big Ten Standings and the results of our three models.
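To make these adjustments concrete, the sketch below computes the value that would replace the 1 in the matrix for a single win. The exact margin codes are not spelled out above, so the linear scale from 0.5 (a 1-point win) to 1.0 (17 points or more) is an assumption, as are the function name `game_value` and its parameters. The weekly factor is additive, matching the 1.05 and 1.1 figures in the text, and neutral-site games are treated like home wins for simplicity.

```python
def game_value(margin, road_win, week,
               road_bonus=3, cap=17, weekly_inflation=0.05):
    """Value credited to the winner of one game (illustrative coding only)."""
    # Road wins get a 3-point bonus, so a 1-point road win equals a 4-point home win.
    adjusted = margin + (road_bonus if road_win else 0)
    capped = min(adjusted, cap)
    # Assumed linear code: 0.5 for a 1-point win, rising to 1.0 at `cap` points.
    margin_code = 0.5 + 0.5 * (capped - 1) / (cap - 1)
    # Additive inflation: week 1 -> 1.00, week 2 -> 1.05, week 3 -> 1.10.
    week_factor = 1 + weekly_inflation * (week - 1)
    return margin_code * week_factor


print(game_value(1, False, 1))   # 0.5
print(game_value(17, False, 1))  # 1.0
print(game_value(1, True, 1) == game_value(4, False, 1))  # True
```

Setting `weekly_inflation=0` gives the second model (margin and location only), and the default arguments give the third.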

| Ranking | Big Ten Standings | Model Considering Only Wins and Losses | Model Also Considering Margin of Victory and Game Location | Model Also Considering Game Week |
|---------|-------------------|----------------------------------------|------------------------------------------------------------|----------------------------------|
| 1 | Indiana | Indiana | Indiana | Indiana |
| 2 | Ohio State | Ohio State | Ohio State | Ohio State |
| 3 | Oregon | Oregon | Oregon | Oregon |
| 4 | Michigan (tie-4) | USC | USC | USC |
| 5 | USC (tie-4) | Michigan | Michigan | Michigan |
| 6 | Iowa | Iowa | Iowa | Iowa |
| 7 | Illinois (tie-7) | Minnesota | Washington | Washington |
| 8 | Minnesota (tie-7) | Washington | Illinois | Illinois |
| 9 | Washington (tie-7) | Illinois | Minnesota | Minnesota |
| 10 | Nebraska (tie-10) | Northwestern | Nebraska | Penn State |
| 11 | Northwestern (tie-10) | Nebraska | Penn State | Nebraska |
| 12 | Penn State (tie-12) | UCLA | Northwestern | Northwestern |
| 13 | UCLA (tie-12) | Penn State | UCLA | UCLA |
| 14 | Rutgers (tie-14) | Rutgers | Rutgers | Rutgers |
| 15 | Wisconsin (tie-14) | Wisconsin | Wisconsin | Wisconsin |
| 16 | Maryland (tie-16) | Michigan State | Maryland | Michigan State |
| 17 | Michigan State (tie-16) | Maryland | Michigan State | Maryland |
| 18 | Purdue | Purdue | Purdue | Purdue |

Comparing the second and third columns of the table, our model that considers only wins and losses essentially just breaks ties in the standings. USC ranks ahead of Michigan because its wins over Michigan and Iowa were better than Michigan’s best win over Washington. Minnesota ranks ahead of Washington and Illinois because its worst loss to Northwestern was better than Washington’s and Illinois’ losses to Wisconsin. Washington ranks ahead of Illinois because of its head-to-head win, despite the fact that Illinois’ best win over USC was better. Northwestern ranks ahead of Nebraska, despite Nebraska’s head-to-head win, because Northwestern had a better win over Minnesota and Nebraska had a worse loss to Penn State. UCLA ranks ahead of Penn State because of its head-to-head win. Rutgers ranks ahead of Wisconsin, despite the fact that Wisconsin had two better wins over Washington and Illinois than Rutgers’ best win over Maryland, because Wisconsin’s loss to Maryland was much worse than Rutgers’ worst loss to Penn State. And Michigan State ranks ahead of Maryland because of its head-to-head win.

Comparing the third and fourth columns of the table, our model that considers margin of victory and game location produces several changes. Washington moves ahead of Minnesota, because it had more large margin wins. Illinois also moves ahead of Minnesota, because it had more large margin wins. Nebraska moves ahead of Northwestern, mainly because of its 7-point head-to-head home win. Penn State moves ahead of both Northwestern and UCLA (despite having a worse conference record than Northwestern), because it had more large margin wins and fewer large margin losses. And Maryland moves ahead of Michigan State, because its 17-point road win over Wisconsin was better than Michigan State’s 10-point head-to-head road win over Maryland.

Comparing the fourth and fifth columns of the table, the model that also considers the game week produced only two small changes. Penn State moves ahead of Nebraska despite having a worse conference record, because its 27-point head-to-head win occurred in Week 13. And Michigan State moves back ahead of Maryland, because its 10-point head-to-head win occurred in Week 14.

So, the only significant deviation from the conference standings in our models is that, in the last two models, Penn State moves ahead of one or two teams that each had one more conference win. Penn State played a more difficult conference schedule than Nebraska and Northwestern, as Penn State played all three Big Ten teams in the College Football Playoff, while Nebraska didn’t play any of those teams and Northwestern played only Oregon. However, that fact does not move Penn State ahead of those teams in our model that considers only wins and losses. Penn State also had more large-margin wins (2) and fewer large-margin losses (1) than either Nebraska (0 wins, 3 losses) or Northwestern (1 win, 2 losses), which helps it in our model that also considers margin of victory and game location. And Penn State ended the conference season with three consecutive wins, which helps it in our model that also considers game week, while Nebraska lost three of its last four games and Northwestern lost four of its last five games.
