
Which colleges produce the most NFL talent?

Posted by Doug on Monday, November 17, 2008

First a plug: if you’re an NBA fan, you should check out what Neil Paine (with assists from Justin Kubatko) has been doing with the basketball-reference.com blog. It’s good hoops-related reading in the same spirit as this blog.

Neil recently did a couple of posts on which colleges have produced the most NBA talent. Inspired by that, and my NFL all-franchise team posts (AFC - NFC) from this summer, I’ve decided to create a “team” from the talent produced by each college, and rank them.

The rules are essentially the same as those for the NFL all-franchise team posts, but with a few college-related extras added:

1. If you haven’t read about my Approximate Value (AV) method for rating players, you should read about it right here.

2. A player is only eligible to play for his final college, if he attended more than one. For example, Troy Aikman can be on the UCLA team, but not the Oklahoma team.

3. Keep in mind that these lists are ordered by NFL production. Archie Griffin was one of the best college football players of all time, but he loses his spot on the Ohio State team to the merely-very-good-in-college Robert Smith, who had a much more illustrious NFL career.

4. The AV system gives a player a score for each player season. To combine these into a career number, I take 100% of the player’s best season, plus 95% of his second-best season, plus 90% of his third-best season, and so on.

5. I’m only comfortable (for now) applying the AV methodology to seasons 1950 and later. Players who debuted before 1950, however, are included if their post-1950 seasons alone merit inclusion. In this case, they have a ‘+’ after their AV score to remind you that their career AV is (probably) higher than the number shown.

6. To avoid 4-3/3-4/5-2 issues, I gave each defense 12 players, including two DT/NTs, two DEs, two OLBs, and two ILB/MLBs. I have also now lumped all safeties together instead of distinguishing between free and strong safeties.

7. Because of the slippery and changing nature of defining what a fullback is, I simply decided to go with two RB/FBs, instead of an RB and an FB.

8. What to do with players whose position was “End” in the 50s? Are they tight ends? Wide receivers? To deal with this problem, I’ve lumped TEs, WRs, and Es of all years into one category, which I’ve called ‘RC’ (ReCeiver), and I’m allowing four of them per team. After all, if the defense is playing with 12, I should allow the offense the same luxury.
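
To make the career weighting in rule 4 concrete, here is a minimal Python sketch. The function name `career_av` and the sample season values are hypothetical, and flooring the weight at zero past the twentieth season is an assumption, since the rule as stated doesn’t say what happens that far out.

```python
def career_av(season_avs):
    """Combine single-season AV scores into one career number.

    Best season counts 100%, second-best 95%, third-best 90%, and so on.
    Assumption: the weight is floored at 0 once it would go negative.
    """
    ranked = sorted(season_avs, reverse=True)
    return sum(av * max(0.0, 1.0 - 0.05 * i) for i, av in enumerate(ranked))


# Hypothetical three-season career: 20*1.00 + 10*0.95 + 5*0.90
print(career_av([10, 20, 5]))
```

Sorting first means the order in which the seasons actually happened doesn’t matter; only their ranking does.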

With all that out of the way, here are the top ten colleges, ranked by the sum of the values of their 24 players:

(Continued)


A Month of Heartache

Posted by JKL on Thursday, November 13, 2008

I’ve had the joy of watching the team I root for, the Kansas City Chiefs, lose three games in a row in which they had a lead in the second half, and three games in which they had a chance to win at the end of the game. It got me wondering what the longest streak of close losses is, and how close the Chiefs are to it.

As it turns out, there have been a total of 101 occasions since 1970 when a team has lost three consecutive games by 7 points or less in each game. However, there have only been 19 teams that have gone on to lose a fourth consecutive game by 7 points or less. The Chiefs are one close loss away from joining some fairly, well, elite is not the word, let’s say rare company. Here is the list of teams, sorted in alphabetical order by team:

(Continued)


Punts Inside the Twenty

Posted by JKL on Friday, October 31, 2008

I was fully prepared to write a post about how the statistic “punts inside the twenty” is completely worthless and misleading. After doing a little research, I don’t think it is completely worthless. I do, however, think that it has the potential to be a bit misleading.

When you think about the statistic, what comes to mind? For 90% of you, the response is probably, “I don’t think about punters and their statistics.” Fair enough. For me, it conveys the preciseness or accuracy of the punter in pinning the opponent near the goal line. But if that’s the purpose, it seems like the statistic is too broad to accomplish that. Is a punt that forces a fair catch at the 17 (which counts as a punt inside the twenty) really that much better than a punt that bounces inside the 5 and goes into the end zone? In fact, I think you could argue the latter is better so long as the team can down a certain percentage of those punts inside the five, even though that punter will have a lower “inside the twenty” percentage by risking more touchbacks.

Further, punts that are boomed from a team’s own end of the field at full power, and may bounce past a returner and come to rest at the 15, are counted the same as a punt from the opponent’s 40, where the punter drops a punt down inside the opponent’s 10 or forces an uneasy fair catch near the goal line. To borrow some golf terms, it’s the difference between using a driver and a sand wedge. In my opinion, the first type of punt (the “driver” that happens to end up inside the twenty) is already adequately reflected and rewarded in both the gross and net punting average of that particular punter. I don’t think it needs to be categorized the same as the latter, where the punter is sacrificing gross average for accuracy. Also, I don’t think punts that are conservatively dropped at the 15 for an easy fair catch should be recognized the same as punts that aggressively force a fair catch decision inside the 10. Thus, I would prefer that we use a “punts inside the ten” or “punts inside the ten versus touchback ratio”, as a smarter, more focused, statistic. Or better yet, the average opponent’s starting position following mid-field punts.

Now to the numbers. I looked at the punts so far in the 2008 season. I focused on those punts that occurred between the punting team’s own 40 yard line and the opponent’s goal line, to which I will refer as “mid-field punts”, separating them from punts that are deeper in the team’s own end and are more based on raw punting distance, rather than accuracy and hang time.

League-wide, there have been 425 punts in the mid-field zone in 2008. 60% (254) of those would officially be counted as punts inside the twenty. 22% (94) of them resulted in touchbacks. Touchbacks are not the worst result, though, as the remaining 18% were punts that were either returned past the 20, punted short of reaching the 20 in the first place, or shanked or hit out of bounds prior to the 20 yard line. None of these punts were returned for a touchdown in 2008, or returned across mid-field, with the average starting position for these bad punts being the 28.7 yard line. The “punts inside the twenty” from mid-field punts, conversely, had an average starting position of the opponent’s 10.7 yard line.

The distribution of mid-field “punts inside the twenty” is not uniform. Here is the number of punts that have been downed, kicked out of bounds, or returned to each yard line inside the twenty in 2008:

Yard		Number
1		6
2		14
3		12
4		6
5		9
6		7
7		12
8		15
9		19
10		22
11		19
12		13
13		18
14		19
15		9
16		16
17		17
18		15
19		6

As you may have guessed, more drives are started at the 10-yard line than at any other. Though we see some fair catches inside the 10, many teams have a 10-yard line rule for fielding a punt, and we see the numbers dip in the yards that follow toward the goal line, until we get inside the 3.
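
As a quick check on the yard-line table, the counts can be tallied in a few lines of Python. This is only a sketch: the variable names are made up for illustration, and the counts and league-wide totals are simply copied from the text above. The last line computes an “inside the ten versus touchback” ratio of the kind suggested earlier, treating “inside the ten” as yard lines 1 through 9, which is an assumption about where to draw the line.

```python
# Counts copied from the table above: yard line -> number of punts ending there.
inside_20 = {
    1: 6, 2: 14, 3: 12, 4: 6, 5: 9, 6: 7, 7: 12, 8: 15, 9: 19, 10: 22,
    11: 19, 12: 13, 13: 18, 14: 19, 15: 9, 16: 16, 17: 17, 18: 15, 19: 6,
}
TOTAL_MIDFIELD_PUNTS = 425  # league-wide figure quoted in the text
TOUCHBACKS = 94             # also quoted in the text

n_inside = sum(inside_20.values())
avg_start = sum(yard * n for yard, n in inside_20.items()) / n_inside
most_common = max(inside_20, key=inside_20.get)
# Assumption: "inside the ten" means yard lines 1 through 9.
inside_10 = sum(n for yard, n in inside_20.items() if yard < 10)

print(n_inside, round(n_inside / TOTAL_MIDFIELD_PUNTS, 3))
print(round(avg_start, 2), most_common)
print(inside_10, round(inside_10 / TOUCHBACKS, 2))
```

The table sums to the 254 punts quoted above, and the weighted average lands within a tenth of a yard of the 10.7 figure in the text.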

So how do the individual teams fare? Which are the best at downing punts and pinning opponents when they have to punt in this mid-field zone? If you look at the official punting statistics at NFL.com, you will see that Chicago and Tampa Bay lead in “punts inside the twenty”, each with 18, and thanks to fewer punts, Chicago has the highest “inside the twenty” percentage, with 47% of all Chicago punts qualifying. However, both of these teams finish roughly middle of the pack in my evaluation. Chicago, for example, has 14 of its 18 “punts inside the 20” from these mid-field punts I measured, but half of them were downed between the 16 and 19 yard line.

And the team that, quite handily, is leading in opponent starting position, with four punts downed inside the 5 and three more inside the 10 (compared to 1 touchback and no other bad punts), is only middle of the pack in the official NFL statistic. Here is a chart for all mid-field punts so far in 2008, listed by team, and sorted by average opponent starting position. Going across, the columns show the opponent’s average starting field position following a punt (Opp.St.), the share of mid-field punts that would officially count as a punt “inside the twenty” (IN20%, shown as a fraction), the total number of mid-field punts, the number of punts that resulted in touchbacks (TB), and the number of punts ending in each of the yardage groups. For example, Jacksonville is dead last in average starting position, with four touchbacks and five punts that resulted in the opponent taking over past the 20, out of 15 total punts.

Team	Opp.St.	IN20%	Punts	TB	1-5	6-10	11-15	16-20	21+
===========================================================================
phi	8.9	0.91	11	1	4	3	1	2	0
pit	11.7	0.80	15	1	2	5	4	2	1
nyj	11.9	0.63	8	1	2	0	2	2	1
det	12.5	0.80	15	2	0	5	4	4	0
clt	12.7	0.73	15	1	4	2	4	2	2
atl	13.2	0.78	18	3	3	4	5	2	1
ari	13.2	0.54	13	4	2	2	1	3	1
ram	14.0	0.73	11	2	1	4	2	1	1
buf	14.1	0.67	12	4	0	3	5	0	0
sdg	14.1	0.63	8	2	1	1	3	0	1
car	14.3	0.73	11	3	0	3	3	2	0
chi	14.3	0.72	18	1	1	5	1	7	3
tam	14.8	0.72	18	2	2	3	5	3	3
min	14.9	0.53	15	6	2	2	2	2	1
sfo	15.0	0.67	9	2	1	2	1	2	1
cin	15.1	0.57	14	2	3	4	1	0	4
oak	15.1	0.65	17	5	3	1	5	2	1
mia	15.5	0.60	15	3	3	2	4	0	3
cle	15.8	0.50	14	5	2	2	2	1	2
was	16.1	0.53	19	1	1	2	4	5	6
dal	16.7	0.47	15	5	0	2	3	3	2
htx	16.8	0.46	13	3	2	1	0	4	3
nyg	16.8	0.60	10	2	0	3	2	1	2
den	17.0	0.60	5	2	0	0	1	2	0
oti	17.0	0.54	13	4	1	1	3	2	2
gnb	17.4	0.44	9	4	0	2	0	2	1
rav	17.5	0.50	14	2	1	2	3	1	5
kan	17.7	0.53	15	3	3	1	2	2	4
new	17.7	0.38	16	8	1	1	4	0	2
sea	18.0	0.33	12	4	2	1	1	0	4
nor	19.7	0.42	12	2	0	3	0	2	5
jac	20.4	0.40	15	4	0	3	0	3	5
===========================================================================

The best punter in these mid-field punting situations has so far been Sav Rocca and the Philadelphia Eagles coverage unit. Four punts inside the 3 and seven within 10 yards of the goal line compared to only one touchback is pretty impressive. At season’s end, I’m going to try to run the second half numbers to see how much correlation there is between how a punting team does in the first half of a season and the second half, to see how much luck and noise is included in this data.

Finally, I also looked at how the percentage of punts a team has in this mid-field zone affects the punter’s overall gross average. For example, Denver and Saint Louis have the two lowest percentages of punts occurring in the mid-field zone. Only 22% of Denver’s punts have occurred in this part of the field, and 27% of Saint Louis’ punts. These teams, not coincidentally, happen to be first and third in gross punting average to this point in the season. When I run the correlation for all teams, the correlation coefficient between mid-field punt percentage and gross punting average is -0.36, suggesting it is a real factor.

Back in July, I wrote about Pro Bowl Punters and how an over-reliance on gross punting average seems to have dominated Pro Bowl selections and generally results in punters on bad offensive teams and in warm climates being selected. This info further suggests that selectors need to place information into context, and in the case of punters, consider that the distribution of punts for each punter is not equal. Punters who get to hit driver all the time shouldn’t be overvalued over those with a good sand wedge.


BCS thoughts

Posted by Doug on Monday, October 27, 2008

In late August, it’s not uncommon for me to feel a little overwhelmed. I’m bracing for the football season and the associated maintenance of this site, and of course my real job starts to get serious right around that time as well.

So it’s also not uncommon around that time of year for me to try to cut out unnecessary time sinks. This year, I decided I would cut out watching college football. I could still follow the scores. I just wouldn’t watch it. Think of the time savings!

Two weeks later I somehow found myself cussing at the TV because of an inexcusable pass interference call in the closing minutes of the Ole Miss / Wake Forest game.

So much for that plan.

I’m glad I was unable to give it up, because it’s been a fun season. As usual, the BCS is full of intrigue and there are lots of possibilities. I think I have it figured out though.

Doug’s sure-fire BCS algorithm

[NOTE: this is not some sarcastic algorithm. I really believe it will tell you exactly what needs to happen for your team to make the title game. This takes into account both the computer rankings and what I think the pollsters will do.]

1. The following teams are still alive: Texas, Alabama, Penn State, Oklahoma, USC, Georgia, Texas Tech, Florida, Oklahoma State. Everyone else is out.

2. At the end of the season, define each team’s “score” to be the number of losses they have.

3. Subtract half a loss for winning the Big XII Championship game or the SEC Championship game.

4. Line them up and take the two teams with the fewest “adjusted losses.”

4b. In the (likely) event of a tie for first or second, the tiebreaker order is as follows: Texas, Georgia, Florida, Alabama, OU, USC, Penn State, Oklahoma State, Texas Tech.

That’s it.
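
For anyone who wants to plug in their own scenarios, the steps above could be sketched in Python roughly as follows. The function name `title_game` and the sample records are hypothetical, and “OU” in the tiebreaker list is written out as Oklahoma.

```python
# Tiebreaker order from step 4b; earlier in the list wins a tie.
TIEBREAK_ORDER = [
    "Texas", "Georgia", "Florida", "Alabama", "Oklahoma",
    "USC", "Penn State", "Oklahoma State", "Texas Tech",
]


def title_game(losses, conference_champs):
    """Return the two teams with the fewest adjusted losses.

    losses: dict of team -> losses at season's end (only teams still alive).
    conference_champs: winners of the Big XII and SEC championship games.
    """
    def sort_key(team):
        bonus = 0.5 if team in conference_champs else 0.0
        return (losses[team] - bonus, TIEBREAK_ORDER.index(team))

    return sorted(losses, key=sort_key)[:2]


# Hypothetical final records, purely for illustration.
example = {
    "Texas": 1, "Alabama": 1, "Penn State": 1, "Oklahoma": 2, "USC": 1,
    "Georgia": 2, "Texas Tech": 2, "Florida": 2, "Oklahoma State": 2,
}
print(title_game(example, conference_champs={"Alabama", "Oklahoma"}))
```

In this made-up scenario Alabama’s half-loss credit puts it first, and Texas takes the second spot on the tiebreaker over the other one-loss teams.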

So that means I believe the following:

1. Penn State will go if it is one of two unbeaten BCS conference teams, and will not go if it’s not.

2. If Penn State loses a game, and Texas survives the regular season unbeaten, the Horns will go to the title game even if they lose the Big XII championship game.

3. If Penn State does not lose a game, Texas will not go to the title game if the SEC champ has one loss (or is unbeaten).


Go West! (and then go West again)

Posted by JKL on Tuesday, October 14, 2008

This season, thanks to the schedule rotation adopted in 2002, both the New England Patriots and New York Jets play four games on the West Coast, against Seattle, San Francisco, Oakland and San Diego. Until this year, neither has played more than two regular season road games in the Mountain and Pacific Time Zones since the merger. Over the previous four seasons combined, the Patriots have played two regular season games and three post-season games (including last year’s Super Bowl in Arizona) out West, while the Jets have played three regular season games and one post-season game.

How rare is it for an Eastern Time Zone team to play this many games out West in a single season? As it turns out, pretty rare. My research has found thirteen individual seasons when an Eastern team has played four or more regular season games out West. For my purposes, I’ll define West as both the Pacific and Mountain Time Zones, so I will include Denver and thus not have to figure out if Arizona was or was not on the same time schedule as the California teams due to daylight saving time. Before the merger of the AFL and NFL, it was theoretically impossible for an Eastern team to play four games in the West (though we’ll find out below it did happen once before). Here are the teams that have travelled West four or more times in a single regular season since the AFL-NFL merger:

===================================
1979 Atlanta Falcons (1-3 in West, 6-10 overall)
1981 Cleveland Browns (1-3 in West, 5-11 overall)
1988 Atlanta Falcons (2-2 in West, 5-11 overall)
1989 New York Giants (3-2 in West, 12-4 overall)
1990 Cincinnati Bengals (2-2 in West, 9-7 overall)*
1991 Atlanta Falcons (3-1 in West, 10-6 overall)
1992 New York Giants (0-4 in West, 6-10 overall)
1994 Atlanta Falcons (1-3 in West, 7-9 overall)
1994 Cincinnati Bengals (1-3 in West, 3-13 overall)
1994 Pittsburgh Steelers (1-3 in West, 12-4 overall)
1997 Atlanta Falcons (2-2 in West, 7-9 overall)
1998 New York Giants (2-2 in West, 8-8 overall)
2005 New York Giants (2-2 in West, 11-5 overall)
========================================

*also lost playoff game at Los Angeles Raiders in 1990 divisional round

Two other teams, the 1987 Cleveland Browns, who lost in the AFC Championship game in Denver, and the 1990 New York Giants, who won at San Francisco in the championship game before winning the Super Bowl, played three regular season games in the West in addition to the fourth game in the conference championship.

It should be no surprise, if you recall that Atlanta played in the NFC West, that they would appear on this list five times. What is surprising is that the New York Giants did it four times, while two other East Coast teams from the same division, the Philadelphia Eagles and Washington Redskins, have never played that many out West.

The sample size of teams here is so low that there is not much meaningful analysis that I can give you as to whether the cumulative effect of this much additional travel in a single season for an East Coast team matters. I went ahead and looked at the simple rating system numbers for each team in the year before and after the extensive Western travelling, compared to the year in question. In the seasons before and after, our Eastern teams had an average SRS of -1.0. In the season in question, the average SRS was -1.3. The Eastern teams performed about 4.0 points worse than their overall SRS in the Western games, which is not that much different from the generally expected 3 points for home field advantage. (I measured this by taking the end of season SRS for the road Eastern team minus the home team, then comparing the actual results versus the expected results from the SRS differences.) There was no real pattern to performing worse or better as the season went on, as a whole, as the first game played out West showed the worst score (relative to season SRS) and the fourth game was the second worst.
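
The expected-versus-actual comparison described in the parenthetical can be written down in a few lines. This is a sketch under stated assumptions: the function name, the optional `home_field` parameter, and the sample game are all hypothetical, and no real SRS values are used.

```python
def points_vs_expected(road_srs, home_srs, road_points, home_points, home_field=0.0):
    """Actual road-team margin minus the margin implied by season SRS.

    home_field defaults to 0 to mirror the text, which compares against
    raw SRS differences and discusses home-field advantage separately.
    """
    expected_margin = (road_srs - home_srs) - home_field
    actual_margin = road_points - home_points
    return actual_margin - expected_margin


# Hypothetical game: road team SRS +2.0, home team SRS -1.0, road team loses 17-20.
print(points_vs_expected(2.0, -1.0, 17, 20))
```

A negative result means the road team did worse than the SRS difference alone would predict; averaging that number over all the Western games gives a figure comparable to the one quoted above.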

Of course, New England isn’t just playing four games out West. They just concluded back to back games on the West Coast against San Francisco and San Diego, and stayed at San Jose in between games to practice rather than travel back East. Later this year, they will also play Seattle and Oakland in back to back weeks. This will mark the first time in the history of the NFL that an Eastern team has played consecutive Western games on two separate occasions within the same season.

Extended trips to the West Coast were not unusual for the old AFL teams, the Boston Patriots and the New York Titans/Jets (as well as the Buffalo Bills). In the old AFL, at least prior to Miami and Cincinnati joining, every team played all other league members on a home and home basis. The East Coast teams usually played two, or sometimes even three consecutive games on the road against Oakland, San Diego and Denver. The worst travel start for a team in the history of the AFL/NFL has to belong to the 1967 Boston Patriots, and it was because the team played their home games at Fenway Park. The Patriots opened the season with three consecutive losses on the road at Denver, San Diego, and Oakland. They returned East and won at Buffalo on September 24. The Patriots were scheduled to play their first home game against San Diego. However, the Boston Red Sox won the American League pennant for the first time since 1946, and advanced to play the Saint Louis Cardinals in the World Series. The Patriots lost out to the primary tenants. Even though the Series opened in Boston, and moved to Saint Louis the weekend of October 7th and 8th, the Patriots moved their game with the Chargers back to the West, playing a second game in San Diego. That game ended in a 31-31 tie, which happens to be the last time the Patriots franchise played in a game that ended in a tie.

So how have other teams done when they have played back to back games in the West? I found thirty occasions where an Eastern team has played a game in the West, then returned West to play again a week later (including the post-season). Based on Sunday night’s game between the Chargers and Patriots, you might guess it had a big impact. It is true that those teams collectively performed worse in the second consecutive game on the West Coast than they did in the first, using the SRS differences for each matchup to account for the relative strength of opponents and comparing the expected results to the actual scores. The Eastern road team was better (relative to their performance in the first game on the road trip) 11 times, worse 18 times, and about the same once.

That said, it’s not so much that the Eastern teams played really badly in the second game. It’s that they played REALLY well, as a group, in the first. Excluding the New England-San Diego game, since I don’t have end of year SRS numbers, the average result in the first game was 0.97 points better than expected, without accounting for home field advantage and the fact the Eastern team was on the road. The average result in the second game was, in contrast, 2.16 points worse than expected without accounting for home field, which is not a bad performance for a road team, regardless of where the game is played. I don’t see any strong evidence that the performance in the second game was worse than what should be expected for a road team if we had no knowledge of where they played the week before. So New England may be the first team to play back to back games on the West Coast at two different times in the same season, but I don’t see any reason to think this is a competitive disadvantage compared to, say, the way the Jets’ trips to the West are spaced this season.


Win probability in football

Posted by Doug on Monday, October 13, 2008

Though it was introduced as early as forty years ago, baseball analysis had something of a win probability revolution in the late 90s. That’s how long it took for the availability of play-by-play data to increase and the price of computers to decrease sufficiently for the everyday sabermetricians to be able to really have some fun with it. As with most things in this area, football is a little behind. But it seems that we are now at a point where win probability is becoming a fairly standard part of the toolbox.

  • Carroll, Palmer, and Thorn tantalizingly mentioned it in The Hidden Game of Football, but didn’t go very far with the idea (I’m guessing because of a lack of good data).
  • The site footballcommentary.com has been around for at least a few years now and has a well-developed and well-explained (and open-source!) win probability computing model.
  • Brian Burke at Advanced NFL Stats has recently built a model that differs from the footballcommentary model in some key ways, and has taken the extra step of building a real-time win probability scoreboard that you can use to follow the games each Sunday.
  • I have no idea how long this site has been around, but I recently stumbled across Gridironmine.com, which has a ton of nifty win probability-related discussion, charts, and tools.

I aim to make a very modest contribution with this post.

In particular, I ran a quick logistic regression to turn the basic idea of Friday’s post into a win probability equation. My model differs from those listed above in many ways. It’s not nearly as precise or as detailed (and hence generally not as useful), and it probably should be totally thrown out the window once you get into the fourth quarter, but it does include one factor that the others don’t: the relative quality of the two teams.

All the models above have built in the assumption that two average teams are facing off (and there are good reasons to do that). In the week four game between the Bills and the Rams, for instance, the Rams held a 14-6 lead early in the second quarter. Immediately following the Rams’ second TD, the Gridironmine model said the Rams had a 75% chance of winning. But I doubt you’d be able to find too many people willing to give you even 2-to-1 odds on the Bills at that point, much less 3-to-1. My model says the game was roughly a toss-up at that point.

Here it is:

M = the current point margin in the game.

Q = quality difference between the two teams, in points.

H = 1 if at home, 0 if on the road.

In the first quarter

WinProb =~ 1 / (1 + exp(-(-.415 + .107*M + .140*Q + .710*H)))

In the second quarter

WinProb =~ 1 / (1 + exp(-(-.355 + .117*M + .127*Q + .635*H)))

In the third quarter

WinProb =~ 1 / (1 + exp(-(-.347 + .159*M + .104*Q + .569*H)))

Notes:

1. This model does indicate different break-even points for different kinds of teams in the same situation. For example, let’s say you’re trailing 7-0 in the first quarter and you have a 4th-and-goal at the two yard line. Assuming the field goal is a guaranteed make, I calculate a break-even point — that is the probability of scoring a TD that would make going for it a good gamble — at about 46% if you are seven points better than your opponent and playing at home, compared to a break-even point of 37% if you are seven points worse than your opponent and playing on the road. In other words, if you think you have a 40% chance of scoring the TD, you should go for it if you’re the underdog, but take the sure field goal if you’re a strong favorite. This makes sense. High variance strategies, like going for TDs instead of field goals, generally are a better play for underdogs than for favorites. This is just another example of that. [NOTE: calculating the partial derivatives of the break-even probabilities with respect to M and Q might be kind of fun.]

2. The regression was based on all scoring plays in all NFL games since 1978. Therefore, it should be interpreted as the win probability of a team that just scored. For example, if a home team which is three points better than its opponent takes the opening kickoff and marches down the field to take a 7-0 lead, this model would give them an 81% chance of winning the game. If they force a punt on the opponent’s ensuing possession, that obviously changes the win probability, but it doesn’t change the model. So again, this model lacks a lot of detail that the other models include, such as down, distance, field position, and exact clock time. [NOTE: I had to take some liberties with the above interpretation in order to calculate the break-even probabilities in the bullet above.]

3. I didn’t include the fourth-quarter equation because, as I mentioned, the endgame strategical decisions probably render it moot. More precisely, I guess I think it’s fair to assume that the win probability for a given score and a given pair of teams wouldn’t be too much different at the beginning of the second quarter versus the end of the second quarter. In the fourth, that’s not true. So I wouldn’t take it too seriously, but just for completeness, here it is: WinProb =~ 1 / (1 + exp(-(-.174 + .257*M + .073*Q + .348*H)))

4. Have I mentioned that this model is NOT better than any of the three I mentioned above? Well it’s not. Just wanted to make sure I was clear about that.

5. For the Q variable, I used the teams’ full-season SRS ratings for the given season. This is somewhat problematic because the SRS includes the results of the game being played. It’s also problematic if we want to compute a win probability in the Bills/Rams game of week 4, because we don’t know the final 2008 SRS ratings for the Bills and Rams. If the goal is at-the-time win probabilities (frankly, I’m not exactly sure what the goal is at this point, except having some fun with data), I probably should rebuild the model with at-the-time SRS ratings instead of end-of-season SRS ratings.

6. I should put this disclaimer on every post I make that involves a regression: unless you have years and years of postgraduate work specific to regression, it’s very hard to be sure that the data you just ran a regression on satisfies the conditions it needs to satisfy for you to draw the conclusions you’d like to draw from it. There are all sorts of rare diseases from which data can suffer. In terms of diagnosing those diseases, I’m probably the equivalent of a first-year med student.
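
For readers who want to play with the equations, here is a minimal Python sketch of the four quarter-by-quarter formulas, plus one way to compute a break-even probability like the ones in note 1. The coefficients are copied from the post; the function names are hypothetical, and the break-even logic (a TD adds 7 to the margin, a made field goal adds 3, a failed try leaves the margin unchanged) is an assumption, since the post only says that some liberties were taken with the interpretation.

```python
import math

# (intercept, M, Q, H) coefficients for each quarter, copied from the post.
COEFFS = {
    1: (-0.415, 0.107, 0.140, 0.710),
    2: (-0.355, 0.117, 0.127, 0.635),
    3: (-0.347, 0.159, 0.104, 0.569),
    4: (-0.174, 0.257, 0.073, 0.348),
}


def win_prob(quarter, margin, quality_diff, at_home):
    """Logistic win probability for the given quarter."""
    b0, bm, bq, bh = COEFFS[quarter]
    z = b0 + bm * margin + bq * quality_diff + bh * (1 if at_home else 0)
    return 1.0 / (1.0 + math.exp(-z))


def break_even_td_prob(quarter, margin, quality_diff, at_home):
    """TD probability at which going for it equals kicking a sure field goal.

    Assumptions (not from the post): a TD is worth 7, a made FG 3,
    and a failed attempt leaves the margin unchanged.
    """
    p_td = win_prob(quarter, margin + 7, quality_diff, at_home)
    p_fg = win_prob(quarter, margin + 3, quality_diff, at_home)
    p_fail = win_prob(quarter, margin, quality_diff, at_home)
    return (p_fg - p_fail) / (p_td - p_fail)


print(round(win_prob(1, 7, 3, True), 2))                # home team, 3 points better, up 7-0
print(round(break_even_td_prob(1, -7, 7, True), 3))     # strong favorite at home
print(round(break_even_td_prob(1, -7, -7, False), 3))   # underdog on the road
```

Under these assumptions the first line matches the 81% example in note 2, and the two break-even figures come out in the neighborhood of the 46% and 37% quoted in note 1.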


Courtney Taylor, fantasy superstar

Posted by Doug on Saturday, October 11, 2008

As you probably know, I am affiliated not just with pro-football-reference.com but also with a fantasy football info site called footballguys.com.

Although I do think every reader of this blog who plays fantasy football will get their money’s worth and more from a footballguys subscription, this post isn’t a commercial for footballguys. And despite how it starts out, this isn’t a post about fantasy football either. It’s a post about correlation vs. causation, about process vs. results.

Every year, footballguys has a contest (with $35,000 cash awarded, I might add) for all subscribers. It works like this:

1. In early August, prices are set for about 250 NFL players. The prices don’t change, nor are players added or deleted from the pool.

2. Each contestant has $250 to spend on a roster of 22 players. That’s your team for the whole year. Set it and forget it. Except you can’t forget it because it’s so darn fun to follow.

3. Every week, the scores of your top QB, top two RBs, top three WRs, top remaining RB/WR/TE, top kicker, and top defense are added up, and that’s your score for the week.

4. Every week, the bottom thousand or so teams are eliminated from contention, never to return. By week 13, there are 250 contestants still alive, and the team that scores the most combined points in weeks 14, 15, and 16 among those teams is the grand prize winner.

There’s a little more to it than that, but that’s the main idea. Here are the full rules if you’re interested.
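
The weekly scoring in rule 3 could be sketched like this. It follows only the simplified description above, not the full rules, and the function name, position labels, and sample roster are hypothetical.

```python
def weekly_score(roster):
    """Best-lineup score for one week.

    roster: list of (position, points) pairs, with position one of
    "QB", "RB", "WR", "TE", "K", "DEF".
    """
    by_pos = {}
    for pos, pts in roster:
        by_pos.setdefault(pos, []).append(pts)
    for scores in by_pos.values():
        scores.sort(reverse=True)

    def top(pos, n):
        return by_pos.get(pos, [])[:n]

    starters = top("QB", 1) + top("RB", 2) + top("WR", 3) + top("K", 1) + top("DEF", 1)
    # Flex: best RB/WR/TE not already counted above.
    leftovers = by_pos.get("RB", [])[2:] + by_pos.get("WR", [])[3:] + by_pos.get("TE", [])
    flex = max(leftovers, default=0)
    return sum(starters) + flex


# Hypothetical week for a partial roster.
sample = [
    ("QB", 20), ("QB", 15), ("RB", 12), ("RB", 10), ("RB", 8),
    ("WR", 14), ("WR", 9), ("WR", 7), ("WR", 3), ("TE", 11),
    ("K", 6), ("DEF", 5),
]
print(weekly_score(sample))
```

Because the lineup is picked after the fact, depth matters: a cheap backup only has to outscore your other options once in a while to help.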

At first glance, there doesn’t appear to be a lot of strategy involved, but there is much more than meets the eye. I can’t even begin to describe how interesting the full contest database is as a playground for probability concepts.

This post investigates whether or not Seattle wide receiver Courtney Taylor was a good pick in this contest. Taylor had 5 catches for 50 yards in the first three weeks and then was released. He cost only $3, but that’s a similar price to several alternatives who have put up much better numbers: Antonio Bryant, DeSean Jackson, Steve Breaston, Muhsin Muhammad, Brandon Lloyd, Devery Henderson, etc. Courtney Taylor was a bad choice.

One way to determine who the most valuable NFL performers have been in this contest is to see what percentage of that player’s original owners are still alive in the contest. Right now, after five weeks, 48.4% of all the contest teams are still alive. So it’s fair to say that a player has been valuable if more than 48.4% of his owners are still alive.

24% of Tom Brady’s owners are still alive.
59% of Jay Cutler’s owners are still alive.

This is pretty obvious.

64% of Steve Slaton’s owners are still alive.
36% of Adrian Peterson’s owners are still alive.

Even though Slaton and Peterson have similar fantasy point totals, Slaton’s owners are doing far better. Why? The main reason is opportunity cost. Slaton cost a buck and Peterson cost $53. The non-Slaton part of a given Slaton owner’s team figures to be much better than the non-Peterson part of a given Peterson owner’s team.

Given this criterion, the most valuable wide receivers in this contest have been:

1. Larry Fitzgerald
2. Brandon Marshall
3. DeSean Jackson
4. Greg Jennings

These all make sense. DeSean Jackson doesn’t have the numbers of the other three, but he only cost $4. And then we have:

5. Courtney Taylor.

At the beginning of the contest, 555 contestants selected Taylor. 353 are still alive, which is 63.6%. Given the sample size, there is less than a one-in-a-million shot of this kind of split being caused by random chance.
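
The post doesn’t say which test produced the one-in-a-million figure. One simple way to sanity-check it is an exact binomial tail, treating each Taylor owner as an independent draw with the contest-wide 48.4% survival rate. That independence assumption and the function name are illustrative, not necessarily the method used here.

```python
from math import comb


def binom_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


owners, alive, base_rate = 555, 353, 0.484  # figures quoted in the post
print(round(alive / owners, 3))

tail = binom_upper_tail(owners, alive, base_rate)
print(tail < 1e-6)
```

Under that simple model the tail probability comes out far below one in a million, consistent with the claim above.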

Yes, Taylor was cheap ($3). But there are plenty of similarly-priced WRs who have performed better. And there are other examples of this. Derek Hagan’s owners are doing better than the contest average. So are Jason Hill’s, despite the fact that Hill hasn’t caught a pass all year! Even contestants who selected Hagan and Courtney Taylor are surviving at a much higher rate than the overall population despite wasting two roster spots.

In keeping with the theme of this post from last Thursday, this is another fine example of the distinction between correlation and causation. Taylor certainly hasn’t been causing his owners to stay alive. But Taylor ownership is unquestionably linked with contest success. So there must be some other variable lurking in the background that is causing both Taylor ownership and success.

That variable is probably something like “paying attention.” On August 1, when the prices were set, Taylor looked like Seattle’s 4th receiver at best. Between then and the deadline for submitting entries, Bobby Engram got hurt and Deion Branch lost any hope he once had of being ready for week one. So Taylor was a starting wide receiver (at least for the first few weeks) who could be had at a low price. That’s a smart guy to pick up. It didn’t work out, but it signals a generally smart contestant.

Here’s something more specific: Courtney Taylor ownership is highly correlated with Brandon Marshall ownership. 46% of Taylor owners also own Brandon Marshall. Only 26% of non-Taylor owners own Marshall. That makes perfect sense; grab Taylor to help get you through Marshall’s suspension. Owners who are thinking along those lines are owners who are making generally smart moves with the rest of their roster. That’s why they are succeeding.

I’m reminded of Moneyball, where Billy Beane talks about process versus results. Often, results are influenced by a variety of factors that are outside your control. If you judge your process solely by the results it generates, you’re not doing a very good job of evaluating yourself in most cases. Given the unpredictability of player performances, it just doesn’t make sense to say, “since Courtney Taylor hasn’t produced good numbers, he was not a smart pick.” Regardless of the results he produced, we have strong evidence that Courtney Taylor was a smart pick, that his selection is the result of a good process that didn’t happen to work out in this particular case.

The process/results distinction often comes into focus when writers, commentators, and fans talk about fourth down plays and other strategic decisions. It worked, so it was the right choice. It didn’t work, so it was a bad choice. The result was X, so therefore the process was X. We just don’t get enough fourth-down plays in a season (or a decade of seasons) to be able to judge the fourth-down decision process based on fourth-down results. That’s why we need analysis like Romer’s. That’s why, while it’s not necessarily proof, it is acceptable to argue along the lines of, “because Coach X goes for it on fourth down more than most coaches, and Coach X also has more overall success than most coaches, going for it more often on fourth might be a good idea.” Again, that probably can’t be the entire argument, but it does count in my mind.


More on scoring first and winning

Posted by Doug on Friday, October 10, 2008

Yesterday’s post and the ensuing discussion inspired me to whip up this little table.

What it shows is the historical win probability of a team that takes a 7-0 lead in the first quarter. It is based on all games since 1978. The twist is that I’ve grouped the games according to the strength difference between the two teams.

For example, the Eagles and 49ers are playing this weekend and Vegas says that, all things considered, the Eagles are 5 points better and hence have about a 69% chance of winning. Roughly speaking, the table below (the ‘-5’ line in particular) says if the 49ers score a first-quarter TD to take a 7-0 lead, then the 49ers have about a 50/50 shot of winning. If, on the other hand, the Eagles strike first with a first-quarter TD, the ‘+5’ line indicates that the Eagles would then have about a 74% chance of winning.

I don’t have historical Vegas lines, so I estimated the difference in team quality using the teams’ Simple Rating System ratings, with 3 points added to the home team’s.
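To make the bookkeeping concrete, here is a small sketch of how a quality difference could be computed and looked up. The SRS ratings in the example are made up, the rounding to the nearest whole point is my guess at how games were binned, and only three rows of the 7-0 table are copied in.

```python
HOME_EDGE = 3.0  # points added to the home team's rating, as described in the post

# Three rows copied from the 7-0 table: quality_diff -> historical win_pct
WIN_PCT_AFTER_7_0 = {-5: 0.50, 0: 0.66, 5: 0.74}

def quality_diff(scoring_srs, other_srs, scoring_team_is_home):
    """Rating gap from the point of view of the team that scored first."""
    edge = HOME_EDGE if scoring_team_is_home else -HOME_EDGE
    return scoring_srs - other_srs + edge

# Hypothetical ratings: a +6.0 road team visiting a -2.0 home team.
road_view = round(quality_diff(6.0, -2.0, scoring_team_is_home=False))
home_view = round(quality_diff(-2.0, 6.0, scoring_team_is_home=True))
print(road_view, WIN_PCT_AFTER_7_0[road_view])
print(home_view, WIN_PCT_AFTER_7_0[home_view])
```

The same game shows up on two different lines of the table depending on which team scores first, which is the whole point of grouping by quality difference.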

+--------------+--------+---------+--------+
| Quality_diff | number | win_pct | margin |
+--------------+--------+---------+--------+
|          -16 |     20 | 0.14    | -12.5  |
|          -15 |     30 | 0.24    | -10.0  |
|          -14 |     20 | 0.23    | -5.1   |
|          -13 |     26 | 0.24    | -7.0   |
|          -12 |     36 | 0.20    | -7.0   |
|          -11 |     42 | 0.26    | -3.1   |
|          -10 |     54 | 0.18    | -6.4   |
|           -9 |     67 | 0.31    | -5.3   |
|           -8 |     74 | 0.34    | -3.2   |
|           -7 |     88 | 0.40    | -1.9   |
|           -6 |     87 | 0.41    | -0.9   |
|           -5 |    101 | 0.50    | 2.0    |
|           -4 |    131 | 0.50    | 0.1    |
|           -3 |    130 | 0.53    | 3.0    |
|           -2 |    164 | 0.54    | 3.1    |
|           -1 |    134 | 0.53    | 2.9    |
|            0 |    137 | 0.66    | 6.4    |
|            1 |    137 | 0.67    | 6.2    |
|            2 |    166 | 0.63    | 5.3    |
|            3 |    160 | 0.62    | 7.5    |
|            4 |    169 | 0.66    | 7.2    |
|            5 |    149 | 0.74    | 12.3   |
|            6 |    154 | 0.76    | 10.7   |
|            7 |    148 | 0.78    | 13.2   |
|            8 |    136 | 0.77    | 12.5   |
|            9 |    141 | 0.82    | 13.5   |
|           10 |     88 | 0.78    | 13.7   |
|           11 |    106 | 0.82    | 15.2   |
|           12 |     95 | 0.84    | 16.0   |
|           13 |     77 | 0.83    | 17.1   |
|           14 |     72 | 0.86    | 19.5   |
|           15 |     65 | 0.83    | 17.5   |
|           16 |     45 | 0.86    | 18.0   |
|           17 |     40 | 0.90    | 18.6   |
|           18 |     32 | 0.87    | 21.6   |
|           19 |     31 | 0.90    | 22.7   |
|           20 |     27 | 0.90    | 18.9   |
+--------------+--------+---------+--------+

The discussion could go a lot of different ways from here, but I’m short on time, so I’ll just invite discussion in the comments.

I’ll close with the corresponding chart for a first-quarter field goal to take a 3-0 lead:

+--------------+--------+---------+--------+
| Quality_diff | number | win_pct | margin |
+--------------+--------+---------+--------+
|          -14 |     20 | 0.18    | -10.1  |
|          -13 |     23 | 0.16    | -8.5   |
|          -12 |     35 | 0.22    | -7.6   |
|          -11 |     47 | 0.19    | -8.2   |
|          -10 |     54 | 0.27    | -7.4   |
|           -9 |     57 | 0.22    | -5.7   |
|           -8 |     57 | 0.32    | -4.8   |
|           -7 |     78 | 0.29    | -4.3   |
|           -6 |     77 | 0.37    | -3.3   |
|           -5 |     75 | 0.32    | -2.4   |
|           -4 |     87 | 0.40    | -2.3   |
|           -3 |     85 | 0.43    | -1.3   |
|           -2 |    105 | 0.39    | -1.8   |
|           -1 |    102 | 0.37    | -2.5   |
|            0 |    109 | 0.48    | 2.1    |
|            1 |    125 | 0.54    | 1.7    |
|            2 |    108 | 0.61    | 3.5    |
|            3 |    106 | 0.57    | 3.8    |
|            4 |     99 | 0.65    | 4.9    |
|            5 |     79 | 0.63    | 6.1    |
|            6 |    106 | 0.68    | 7.3    |
|            7 |     84 | 0.72    | 9.2    |
|            8 |    102 | 0.71    | 9.5    |
|            9 |     74 | 0.68    | 7.7    |
|           10 |     71 | 0.72    | 9.1    |
|           11 |     51 | 0.72    | 11.9   |
|           12 |     36 | 0.68    | 7.5    |
|           13 |     40 | 0.79    | 14.8   |
|           14 |     40 | 0.83    | 16.7   |
|           15 |     35 | 0.87    | 16.2   |
|           16 |     20 | 0.86    | 16.3   |
+--------------+--------+---------+--------+

Why 13>14, how Devin Hester cost the Bears Super Bowl XLI, and other mysteries

Posted by Doug on Thursday, October 9, 2008

For reasons unknown to me, my referral logs last week showed several clicks from this Phil Birnbaum post about this nearly-two-year-old Chase Stuart post about the odd fact that NFL teams win more often when they score 13 points than when they score 14.

It turns out that, in the aggregate, teams allow fewer points when they score 13 than when they score 14.

Commenter Alex said this:

It’s not that scoring two field goals instead of a touchdown causes a team to win, it’s that being in the lead causes a team to score two field goals instead of a touchdown.

To which I added this:

It didn’t occur to me, but now that you phrase it that way, it’s just another version of the old “Dallas is 97-2 when Emmitt rushes the ball 30 or more times” schtick.

Nobody runs the ball when they’re trailing big, and nobody kicks field goals when they’re trailing big.

Now that we’ve got detailed scoring logs, I can investigate this a little further. Consider this:

Since 1970, teams that score their 13th point in the second half find themselves either tied or in the lead 58% of the time after scoring that 13th point.

Teams that score their 14th point in the second half find themselves either tied or in the lead only 48% of the time after scoring that 14th point.

In the fourth quarter, those numbers are 52% and 35%.

[Obviously, I'm ignoring the brief instant between a TD and a successful PAT in both calculations.]

Or consider this:

A second-half field goal either ties the game or gives the team the lead 75% of the time. A second-half TD either ties the game or gives the team the lead only 69% of the time. That’s a slim margin, but it’s real; these are huge sample sizes. In the fourth quarter, those figures are 81% and 66%.

I think this is pretty solid evidence that scoring 13 points is an effect, rather than a cause, of winning games.

While I was wading around in the database, I scraped up a few other fun facts.

Here’s another one from the correlation-is-not-causation file…

Since 1978, teams that have a kickoff return for a touchdown win 52% of the time, but teams that have a punt return for a touchdown win 69% of the time. Teams that have an interception return for a touchdown win 76% of the time. Teams that have a fourth-quarter interception return TD win 90% of the time.

Why? Because teams that are returning kickoffs are teams that were just scored upon (usually), while teams that are returning punts are teams that just played some good defense. Passes that turn into interception return TDs are often either (1) risky passes or (2) passes thrown by poor quarterbacks or into very good secondaries. Risky passes — especially in the fourth quarter — are far more likely to be thrown by teams that were already losing. Teams with bad quarterbacks are also more likely to be losing. So it’s not that pick-sixes are worth more than kick returns; it’s just that they happen to teams that are richer.

Now here’s one that I don’t understand…

Since 1950, there have been 75 opening kickoffs returned for TDs. Only 53% of those teams won the game. Contrast that with the 63% win percentage for teams who score a 1st quarter non-kickoff-return TD to take a 6-0, 7-0, or 8-0 lead.

EDIT: I originally erroneously posted 47%, but I was reading it backwards. 53% is the correct number.

When I presented this to Chase, he pointed out that an opening kickoff return TD is worth about 5.7 points (prior to the kick the receiving team has about a 0.7-point expectation, the TD is worth 6.4, so the net is 5.7). So if the two teams were equally matched, Chase reasons, an opening kickoff return score essentially makes the scoring team a 5.5- or 6-point favorite, and those kinds of teams win 70–75% of the time. So why did these teams win only 53% of the time?

My first thought was that these 75 particular teams might have just happened to be weak teams. This might be the result of random chance, or it might be due to a general tendency for weaker teams to have stronger kick return units (I have no idea if that tendency exists, but it is plausible: weaker teams have more incentive to have good kick return units, and they get more practice!). But as far as I can tell, that wasn’t the case. These 75 teams had an overall winning percentage, in all games, of 49.5%.

Maybe a game-by-game look at all 75 games would reveal something, but before I do that I’ll open the question to you:

Does returning the opening kickoff for a TD hurt your chances of winning?

I don’t think so, but I can’t find the easy explanation for this phenomenon.

EDIT: even after the correction, it’s still an interesting bit of data, but quite a bit less mysterious. It is almost certainly due to random chance, plus a collection of factors mentioned in the comments.


Yards per Reception: “Worst” WRs ever

Posted by Chase Stuart on Tuesday, October 7, 2008

Yesterday, I looked at the WRs with the highest yards per reception ratios (compared to league average) in NFL history. Today I’m going to look at the “worst” ones, with that word in quotes because I’m not really sure a low YPR is bad.

I’ll use the same three methods employed yesterday to rank the receivers again. The first involves taking the difference between the WR’s yards per reception ratio and the league-wide yards per reception rate (after subtracting out the individual WR’s own numbers), and multiplying that difference by the number of receptions. So if the league average YPR was 15.00, and a WR had 100 receptions for 1400 yards, he’d have a value of -100: he averaged 14 YPR, 1.00 below 15 YPR, over 100 receptions. Similarly, if he had 50 receptions for 650 yards, he’d be 100 yards below average as well. He’d have a 13.00 YPR average, which is 2.00 below 15.00, for 50 receptions.
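For anyone who wants to play along at home, the first method can be sketched like this. The function names are mine, and the league totals in the last line are invented so that the average works out to a round 15.00.

```python
def league_ypr_excluding(league_rec, league_yards, rec, yards):
    """League-wide yards per reception with one receiver's own numbers removed."""
    return (league_yards - yards) / (league_rec - rec)

def ypr_value(rec, yards, league_ypr):
    """Yards above (positive) or below (negative) what a league-average
    receiver would have gained on the same number of receptions."""
    return (yards / rec - league_ypr) * rec

print(ypr_value(100, 1400, 15.0))  # the 100-catch example from the text
print(ypr_value(50, 650, 15.0))    # the 50-catch example from the text

# Hypothetical league totals, chosen only to give a 15.00 average.
print(league_ypr_excluding(10100, 151400, 100, 1400))
```

Both example receivers come out at 100 yards below average, which is why the method rewards (or punishes) volume as well as rate.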

Here’s a list of the 50 WRs with the “worst” YPR seasons of all time:

(Continued)


Yards per Reception: Best WRs ever

Posted by Chase Stuart on Monday, October 6, 2008

I never know what to make of the yards per reception statistic. Theoretically, I believe that WRs with low YPR are probably more valuable, but practically, it seems that WRs with high YPR are the best ones. YPR is inherently misleading since it has a good thing in the numerator and a good thing in the denominator. A reception is good; yards are good; so why are we dividing these things? A three-yard reception on third-and-two lowers a player’s yards per reception average. That said, even if I don’t know what to do with it, I know how to create a list of the best YPR guys in NFL history.

As we all know, WRs have seen their yards-per-reception ratio decrease significantly over the past couple of decades. Here’s a list of the league-wide yards per reception average for all wide receivers in the league:

(Continued)


The final value of a passing touchdown

Posted by Chase Stuart on Friday, October 3, 2008

This week, I’ve been writing about the value of touchdowns scored on different downs. One of the problems with the data, though, is that the sample size isn’t very large for each down. There is some down-to-down variation that exists in the data that doesn’t make a lot of sense, and is probably a result of a small sample size. Further, I think the numbers should be consistent: your odds of scoring on 1st and goal from the one should be the same as your odds of scoring on 4th and goal at the one (and in fact, the numbers say they pretty much are). Your odds of fumbling should be the same, too.

One of the problems with the results from Wednesday’s post is that a first down touchdown is undervalued; that is, it says the situation of being in 2nd and goal from the one is worth 5.50 points. But according to theory that we’re very confident in, 1st and goal from the one is worth 5.55 points. There should be a greater spread than that. So here’s what I did.

For every play at the one yard line (designated in the table below as 1st, 2nd, 3rd or 4th and goal), a team has a 55% chance of scoring a touchdown, a 2% chance of a fumble, a 2% chance of an interception, a 12% chance of a loss of yards, and a 29% chance of no gain. A touchdown will be worth 6.4 points, a fumble is worth 0.9, and an interception -0.25. A play of “no gain” is worth whatever the next situation is worth. So no gain on 2nd and goal at the one is worth 4.746, since that’s the value of having 3rd and goal at the one. A loss is worth slightly less than 75% of a play for “no gain”. For fourth down, I assume a 50% chance of a field goal attempt, and the probability of every other outcome is cut in half. The table below sums this up:

                  1ST + G                    2ND + G
           Prob    Value   Exp        Prob    Value   Exp
TD         0.55    6.4     3.52       0.55    6.4     3.52
FUM        0.02    0.9     0.02       0.02    0.9     0.02
INT        0.02    -0.3    -0.01      0.02    -0.3    -0.01
FG         0.00    2.4     0.00       0.00    2.4     0.00
LOSS       0.12    3.9     0.47       0.12    3.5     0.42
NO GAIN    0.29    5.3     1.55       0.29    4.7     1.38
TOTAL                      5.55                       5.33

                  3RD + G                    4TH + G
           Prob    Value   Exp        Prob    Value   Exp
TD         0.55    6.4     3.52       0.275   6.4     1.76
FUM        0.02    0.9     0.02       0.01    0.9     0.01
INT        0.02    -0.3    -0.01      0.01    -0.3    0.00
FG         0.00    2.4     0.00       0.50    2.4     1.20
LOSS       0.12    2.4     0.28       0.06    1.2     0.07
NO GAIN    0.29    3.2     0.93       0.145   1.2     0.17
TOTAL                      4.75                       3.20

That 5.55 number reflects the value of 1st and goal from the one yard line, which is what our theory predicts. So now a touchdown on a long bomb is worth 0.85 extra points (6.4 - 5.55), a touchdown on 1st and goal is worth 1.07 extra points, on 2nd and goal is worth 1.65 extra points, on 3rd and goal is worth 3.20 extra points and on 4th and goal is worth, still, 4.85 extra points. Using the numbers from Wednesday’s post, this means the average passing touchdown is worth 1.325 extra points. If we convert that number to yards, that would mean each passing touchdown, on average, is worth 18.3 extra yards.
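The chain of calculations above can be sketched in a few lines. This is only my reconstruction: the probabilities and the values for a loss (3.9, 3.5, 2.4, 1.2) and for a fourth-down stop (1.2) are read straight off the table, “no gain” is set to the value of the next down, and the weights are the 2007 passing touchdown counts from Wednesday’s post (442, 84, 95, 90 and 9 out of 720). Because it carries unrounded values, it can differ from the table in the second decimal place.

```python
TD, FUMBLE, INTERCEPTION, FIELD_GOAL = 6.4, 0.9, -0.25, 2.4

def down_value(p_td, p_fum, p_int, p_fg, p_loss, loss_value, p_no_gain, no_gain_value):
    """Expected points of one snap from the one-yard line."""
    return (p_td * TD + p_fum * FUMBLE + p_int * INTERCEPTION + p_fg * FIELD_GOAL
            + p_loss * loss_value + p_no_gain * no_gain_value)

# Fourth down: half the snaps are field goals, every other outcome is halved.
fourth = down_value(0.275, 0.01, 0.01, 0.50, 0.06, 1.2, 0.145, 1.2)
# Earlier downs: a play for no gain is worth whatever the next down is worth.
third = down_value(0.55, 0.02, 0.02, 0.0, 0.12, 2.4, 0.29, fourth)
second = down_value(0.55, 0.02, 0.02, 0.0, 0.12, 3.5, 0.29, third)
first = down_value(0.55, 0.02, 0.02, 0.0, 0.12, 3.9, 0.29, second)

# Extra points a touchdown adds over being stopped at the one, by situation.
extra = {
    "non-goal-to-go": TD - first,
    "1st and goal": TD - second,
    "2nd and goal": TD - third,
    "3rd and goal": TD - fourth,
    "4th and goal": 4.85,  # figure quoted in the text
}
td_counts = {"non-goal-to-go": 442, "1st and goal": 84, "2nd and goal": 95,
             "3rd and goal": 90, "4th and goal": 9}

average_extra = (sum(extra[k] * td_counts[k] for k in extra)
                 / sum(td_counts.values()))
print(f"1st and goal at the one: {first:.2f}")
print(f"average passing TD: {average_extra:.3f} extra points")
```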

However, it’s slightly more complicated than that. Sure, we know all passing yards aren’t equal, but leaguewide, passing yards aren’t evenly distributed. A pass from the 45 to the 50 isn’t as valuable as one from the 5 to the end zone; we know that. But it’s also true that the former pass happens very, very often, and the latter is relatively rare. In other words, lots of the passing yards that QBs get are of the less-than-average-value variety. And if that’s the case, then the average passing yard isn’t as valuable as the average yard on the field. And if that’s the case, then a passing touchdown is even more valuable than we thought.

Doug looked at every passing play in 2007 that gained at least one yard. Then he individually looked at each yard it covered, computed the value of that yard, and added that value to a giant running total. If you divide that total by the total number of yards, you should get the league-wide value of a typical passing yard. Remember 98 yards is worth 7.1 points, meaning 1 yard is worth 0.0724 points. Well, according to Doug, the average passing yard is worth 0.0653 points, which means that one point is worth 15.3 “average passing yards”. So we should be multiplying 15.3 times the 1.325 points the average passing touchdown is worth. In other words, a touchdown is worth 20.3 passing yards.

Except for one more point. As Vince pointed out on Wednesday, we should also be subtracting one yard from our total. Since we’re measuring the point value from the 1 into the end zone, we should then subtract out one yard at the very end. So, for the last time for a while, I’m going with 19.3 yards as the ***official*** value of a passing touchdown. It’s still worth remembering, though, that a generic touchdown on a non-X-and-goal play is still worth only 10.7 yards, or 13.0 “average passing yards”.
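Since there are a lot of conversions flying around, here is the arithmetic in one place. It is just a sketch of the numbers quoted in this post; the variable names are mine, and like the post it uses the rounded 15.3 and 13.8 conversion factors rather than carrying extra decimals.

```python
POINTS_PER_FIELD = 7.1   # value of the 98 yards between the two one-yard lines
YARDS_PER_FIELD = 98

points_per_field_yard = POINTS_PER_FIELD / YARDS_PER_FIELD    # about 0.0724
field_yards_per_point = YARDS_PER_FIELD / POINTS_PER_FIELD    # about 13.8
passing_yards_per_point = 1 / 0.0653                          # about 15.3

avg_td_points = 1.325                         # average passing TD, in extra points
td_in_passing_yards = 15.3 * avg_td_points    # about 20.3
official_value = td_in_passing_yards - 1      # subtract the last yard: about 19.3

generic_td_points = 0.85                      # TD on a non-X-and-goal play
generic_in_field_yards = generic_td_points * 13.8 - 1     # about 10.7
generic_in_passing_yards = generic_td_points * 15.3       # about 13.0

print(round(official_value, 1), round(generic_in_field_yards, 1))
```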


The SEC and Big XII schedules are weird. The ACC’s slightly less so.

Posted by Doug on Thursday, October 2, 2008

Yesterday Chase proposed a fairly major modification of the Adjusted Yards Per Attempt statistic that is the basis of his various passer rating formulae. In particular, he is proposing to increase the TD bonus from 10 yards to 17 or 18. I intend to argue in a future post that that number should be 20.

But that has to wait, because I am fascinated by a bizarre and obscure topic that no one else cares about, and I have to tell you all about it….

I’m sure there are bigger schedule-o-philes in the world than me, but I’m generally in tune with how sports schedules operate and what their consequences are. So I’m ashamed to say that the SEC and Big XII have existed in their current two-division, 12-team format for more than a decade now, but I’ve never sat down and worked out how the interdivisional schedules work in those two conferences.

I did so this week, and was surprised by what I found. The main point of this post is to try to find people who have explanations for why these schedules are structured as they are.

For the sake of definiteness, I’ll talk about the 2008 Big XII regular season schedule. With the appropriate mapping of teams to teams, the 2008 SEC schedule is identical. The 2008 ACC schedule is different; I’ll talk about that later. I haven’t checked earlier years to see if the 2008 system is the standard or if the schedule structure varies.

The Basics

The Big XII is divided into two divisions (called North and South) of six teams each. Each team plays every team in its own division once. Each team plays three of the six teams in the other division. For the purposes of this post, the intradivisional games are uninteresting. What I’m interested in is how to decide which three North teams a given South team will play.

Three natural ways to construct the schedule

1. All members of the Big XII North hold hands and stand in a circle. All members of the South hold hands and stand in a circle just inside the North circle. Align the circles so that a South team is standing directly inside each North team. That’s your first opponent. Now leave the North circle stationary and rotate the inner circle 60 degrees. You’re now standing next to your second opponent. Rotate again and you’ve got your third opponent. Done.

2. In each division, form two subdivisions of three teams each. South A plays North A and South B plays North B.

3. In each division, form three subdivisions of two teams each. South A plays North A, South B plays North B, and South C plays North C. Now everyone needs one more game. So each South A team could play a North B team, each South B team could play a North C team, and each South C team could play a North A team.
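Here is a quick sketch of those three constructions, mostly to confirm that each one really does give every team exactly three interdivisional games. The teams are anonymous placeholders (S0 to S5 and N0 to N5), and the particular way the subdivisions are cut is arbitrary.

```python
from collections import Counter

SOUTH = [f"S{i}" for i in range(6)]
NORTH = [f"N{i}" for i in range(6)]

def circle_method():
    """Method 1: South team i meets the North teams at offsets 0, 1, 2 on the circle."""
    return [(SOUTH[i], NORTH[(i + shift) % 6]) for shift in range(3) for i in range(6)]

def two_subdivisions():
    """Method 2: teams 0-2 are subdivision A, teams 3-5 are B; A plays A, B plays B."""
    games = []
    for block in (range(0, 3), range(3, 6)):
        games += [(SOUTH[s], NORTH[n]) for s in block for n in block]
    return games

def three_subdivisions():
    """Method 3: pairs (0,1), (2,3), (4,5) are A, B, C; each South pair plays its
    own North pair, plus one game each against the next North pair."""
    games = []
    for k in range(3):
        own = (2 * k, 2 * k + 1)
        nxt = ((2 * k + 2) % 6, (2 * k + 3) % 6)
        games += [(SOUTH[s], NORTH[n]) for s in own for n in own]
        games += [(SOUTH[own[0]], NORTH[nxt[0]]), (SOUTH[own[1]], NORTH[nxt[1]])]
    return games

def games_per_team(games):
    """Count how many interdivisional games each team appears in."""
    return Counter(team for game in games for team in game)

for method in (circle_method, two_subdivisions, three_subdivisions):
    counts = games_per_team(method())
    print(method.__name__, sorted(set(counts.values())))
```

All three are symmetric in the sense that swapping the roles of North and South gives a schedule with the same shape.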

These three methods popped into my head immediately and, even after some thought, no other method did. So before I looked at the schedule, I figured it had to be one of these three.

It wasn’t.

Here’s how they do it

They divide the South into three subdivisions and the North into two. In the South, here is what we have:

Big XII South Division A: OU, Texas Tech
Big XII South Division B: Oklahoma State
Big XII South Division C: Texas, Baylor, Texas A&M

In the North, we have:

Big XII North Division A: Kansas, K-State, Nebraska
Big XII North Division B: Mizzou, Iowa State, Colorado

And the schedule is constructed as follows:

Everyone in South A plays everyone in North A
Everyone in South B plays everyone in North B
Everyone in South C plays one team from North A and two teams from North B
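A sanity check that this structure is even feasible, sketched under one loud assumption: the post does not say which North teams each South C team draws, so the South C pairings below are invented for illustration and are not the actual 2008 schedule. The subdivisions themselves are the ones listed above.

```python
from collections import Counter

SOUTH = {"A": ["OU", "Texas Tech"],
         "B": ["Oklahoma State"],
         "C": ["Texas", "Baylor", "Texas A&M"]}
NORTH = {"A": ["Kansas", "K-State", "Nebraska"],
         "B": ["Mizzou", "Iowa State", "Colorado"]}

games = []
# Everyone in South A plays everyone in North A.
games += [(s, n) for s in SOUTH["A"] for n in NORTH["A"]]
# Everyone in South B plays everyone in North B.
games += [(s, n) for s in SOUTH["B"] for n in NORTH["B"]]
# Everyone in South C plays one North A team and two North B teams.
# HYPOTHETICAL pairing: the i-th South C team takes the i-th North A team
# and the i-th and (i+1)-th North B teams.
for i, s in enumerate(SOUTH["C"]):
    games.append((s, NORTH["A"][i]))
    games.append((s, NORTH["B"][i]))
    games.append((s, NORTH["B"][(i + 1) % 3]))

games_per_team = Counter(team for game in games for team in game)
print(len(games), games_per_team)
```

With that pairing every team ends up with three interdivisional games, so the asymmetric structure does work; it just isn’t one I would have guessed.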

This strikes me as much more complicated than the three methods I outlined above, but what I find particularly strange is that it’s not symmetric. The North and the South are not interchangeable. I guess there’s no reason why they have to be, but don’t humans naturally tend toward symmetry in their designs when possible?

The ACC’s schedule is symmetric. Before I tell you about it, I’ll go on a mini-rant about The New ACC in general: IF I LIVE TO BE A THOUSAND YEARS OLD, I WILL STILL NOT BE ABLE TO FIGURE OUT WHO IS IN WHICH DIVISION IN THE ACC. It drives me crazy. Part of the problem is a certain interchangeability and generic-ness of the teams themselves. But there also doesn’t seem to be any geographical or other basis for remembering what’s what. The divisions are named the Atlantic and the Coastal. How is that supposed to help me?

The ACC essentially breaks it down into two two-team subdivisions and two one-team subdivisions in each division. I’ll call the one-team subdivisions A and D and the two-team subdivisions B and C. With that, A plays A and B, D plays C and D. Then each team in B plays one team from B and one team from C. There may be a better way to visualize that, but it’s not equivalent to any of the three methods I outlined at the beginning of the post.

My questions and comments

1. Does anyone know anything about the history of these schedule structures? Have they always been like this or do they vary?

2. Assuming they’ve always been like this, how did they decide on these methods? The Big XII / SEC schedule strikes me as something that was either the result of a whole lot of thought or no thought at all. There may well be a good reason for it and I’m just not seeing it. If so, what’s the reason? If not, do you think they just started pairing teams up willy-nilly and this is what they ended up with?

3. Also assuming they’ve always done this, does anyone know how the teams rotate through the various subdivisions from year to year?

4. My interest in this investigation comes from issues of fairness. Among these methods, which one maximizes the probability of the best team winning the conference? Or does it not make any difference? That question is pretty high on my long-term to-do list right now.

5. I have ruled out the following as explanations for the weirdness of these schedule structures:

5a. The fact that it is necessary to alternate home games. I’m pretty sure the Big XII schedule stays the same for two years at a time, with only the locations alternating. This could be done with any schedule structure.

5b. The fact that certain interdivisional games have to take place every year. I believe Tennessee and Alabama play every year despite being in opposite divisions. Maybe Georgia and Auburn do too. Florida State and Miami? Anyway, I don’t see how this explains the structure. Regardless of the structure, a rearrangement of the teams within the structure could ensure that Tennessee and Alabama are always paired up.

5c. The fact that it is desirable (I assume) for every team to play every other team an equal number of times in the long run and/or for the mixture of visiting opponents in every city to be kept “fresh” in some sense. Again, these goals could easily be accomplished by an appropriate year-to-year shuffling of teams, regardless of the schedule structure.

6. Has anyone ever read an article about these scheduling procedures? Does anyone have any idea about someone I could contact to find out more information about how they came to be?


What’s a touchdown worth?

Posted by Chase Stuart on Wednesday, October 1, 2008

On Monday, I looked at the difference between having the ball on first down at the one yard line and scoring a touchdown. Last year, Doug explained why a touchdown was worth ten yards, at least according to The Hidden Game of Football. In this post, I’m going to try and explain what a touchdown really is worth. Later this week, either Doug or I will probably write a postscript to this series.

Every touchdown except those that come on a 1st, 2nd, 3rd or 4th and goal situation can be measured by the analysis in yesterday’s post. As long as you gain a first down on the play in question, the difference between a touchdown and the ball at the one depends solely on the distribution of results in 1st-and-goal situations. According to the post yesterday, 1st and goal from the one is worth about 0.865 touchdowns.

But what about a touchdown on 4th and goal? That’s pretty valuable. Touchdowns on 1st and goal are slightly more valuable than your run of the mill touchdown. Any touchdown on a down-and-goal situation is more valuable than a touchdown in any other situation, since one less yard in the former situation puts you in a worse position than one less yard in the latter position. To use an example, coming up an inch short from the end zone on 3rd and 10 from the 15 gives you 1st and goal from the one; coming up an inch short on 3rd and goal from the one gives you 4th and goal from the one; quite clearly, the former situation is better.

So how many touchdowns come in these X-and-goal situations? In 2007, there were 720 passing touchdowns. Of those, 442 touchdowns came on non-goal to go situations. Of the remaining 278 touchdowns that came in goal-to-go situations, 84 passes were on first down, 95 were on second down, 90 came on third down and nine came on fourth down. This means that 61.4% of passing touchdowns were of the non-goal-to-go variety, 11.7% came on 1st and goal, 13.2% came on 2nd and goal, 12.5% came on 3rd and goal and 1.25% came on 4th and goal. We already know what a passing touchdown is worth in 61.4% of the situations — what about the rest?

The fourth and goal situation is easy to analyze. If the pass goes for a touchdown, it’s worth 6.4 points. If it came up an inch short, your team would now be on defense, with the opponent facing 1st and 10 from their own one yard line. According to Romer, that situation is worth -1.55 points to the offense, or +1.55 points to the defense. This makes a completion on 4th down that crosses the goal line worth 4.85 more points than one to the one inch line.

Grading the value of a third down touchdown pass is a little tricky, because it involves weighing sub-optimal decision making. Your average third-and-goal pass that brings the offense to the one-yard line results in a field goal on the next play, but that’s usually a bad decision. But because we’re dealing with averages here, we’ll simply concede that a failed play can be made worse by the likelihood of a bad coaching decision. So to get the value of a third down touchdown relative to a third down pass to the one, we need to know the value and likelihood of the alternative options on 4th-and-1: a successful field goal, a missed field goal, a successful fourth and one, and an unsuccessful fourth and one. The odds of a missed field goal from that close are small enough to be ignored, which means we can assume all field goal attempts are successful. The odds of a successful fourth and one, based on data from the past three seasons, are 0.55. To conclude, a successful field goal is worth 2.4, a successful fourth and one is worth 6.4, and an unsuccessful fourth and one is worth about +1.15 (this is because the ball is usually turned over at around the three). In 2007, twenty of forty times the coaches chose to kick the field goal instead of going for the score. Using that ratio, we can find a weighted average value of a fourth and one situation to be +3.22. Therefore, the value of fourth and goal at the one is +3.22, which is 3.18 fewer points than a touchdown.

To get the value of a second down touchdown relative to a play down to the one yard line, we need to know the value and likelihood of the alternative options on 3rd-and-1: a touchdown, a turnover, a play for no gain, or a loss of yards. We know the value of two of those things already: a touchdown is worth 6.4, and a play for no gain is worth 3.22, because that’s fourth and one. We can assume that a play that loses yards is simply worth 2.4, because then the team will kick a field goal. In 2007, 59 teams faced a 3rd-and-1 from the 1 and thirty-three of them scored touchdowns. Three more went for lost yardage and ensuing field goals, 21 went for no gain and fourth and one situations, and the final two were interceptions that resulted in touchbacks (value of -0.25). The weighted average tells us that the value of 3rd-and-goal from the 1 is +4.84, which is 1.56 fewer points than a touchdown.

To get the value of a first down touchdown relative to a play that ends at the one, we need to know the value and likelihood of the alternative options on 2nd-and-1: a touchdown, a turnover, a play for no gain, or a loss of yards. Once again, we know the value of a touchdown and of the potential 3rd-and-1 situation. In 2007, there were 97 2nd-and-goal situations from the one, and 55 went for touchdowns (+6.4) and 28 went for no gain (and thus third and one, +4.84). Of the remaining fourteen plays, there was one fumble (+0.9), one interception (-0.25) and twelve plays that averaged into a third and two situation. For simplicity’s sake, I’m going to approximate a third and two as around +3.8. Add up all the possibilities, and this comes to a 5.50 weighted average, the value of 2nd-and-goal at the one.
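The last three paragraphs are one chain of weighted averages, working backward from fourth down. Here is a sketch of that chain using the 2007 counts quoted above; the helper name is mine, and the code carries unrounded values from step to step, so the intermediate numbers can differ from the text in the third decimal place.

```python
TD, FIELD_GOAL, FUMBLE, INTERCEPTION = 6.4, 2.4, 0.9, -0.25

def weighted_value(outcomes):
    """Average value over (count, value) pairs."""
    total = sum(count for count, _ in outcomes)
    return sum(count * value for count, value in outcomes) / total

# 4th and goal at the one: coaches kicked 20 of 40 times; going for it
# succeeds 55% of the time and a failure is worth about +1.15.
go_for_it = 0.55 * TD + 0.45 * 1.15
fourth_and_goal = weighted_value([(20, FIELD_GOAL), (20, go_for_it)])

# 3rd and goal at the one: 33 TDs, 3 losses (then a field goal),
# 21 plays for no gain, 2 interceptions for touchbacks.
third_and_goal = weighted_value([(33, TD), (3, FIELD_GOAL),
                                 (21, fourth_and_goal), (2, INTERCEPTION)])

# 2nd and goal at the one: 55 TDs, 28 no gain, 1 fumble, 1 interception,
# 12 losses approximated as third and two (+3.8).
second_and_goal = weighted_value([(55, TD), (28, third_and_goal), (1, FUMBLE),
                                  (1, INTERCEPTION), (12, 3.8)])

print(round(fourth_and_goal, 2), round(third_and_goal, 2), round(second_and_goal, 2))
```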

Here’s a final table of the value of each touchdown pass:

                   To the 1   TD    Diff   % of TD passes
Non-goal to go     5.55       6.4   0.85   61.39
1st and goal       5.50       6.4   0.90   11.67
2nd and goal       4.84       6.4   1.56   13.19
3rd and goal       3.22       6.4   3.18   12.50
4th and goal       1.55       6.4   4.85    1.25

Simple multiplication tells us, then, that the average touchdown pass is worth 1.29 points. The majority of touchdowns are worth only 0.85 points, but enough touchdowns are worth over 3 points to bring that average up to 1.29. Now depending on who you listen to, a point is worth about 10-15 yards. According to Romer, the 98 yards on the field (from the 1 to the 1) span from a value of -1.55 to +5.55; in other words, 98 yards are worth 7.1 points. Therefore, one point is worth about 13.8 yards, and 1.29 points are worth about 17.8 yards. This is a significant increase from the 10 yards given in The Hidden Game of Football, and depending on blog reaction (and the next post), it is going to be my new standard for valuing touchdown passes.
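The "simple multiplication" and the points-to-yards conversion look like this in code. It is a sketch built only from the table and the Romer endpoints quoted above; the variable names are illustrative.

```python
# (situation, extra points a TD adds over reaching the 1, share of 2007 TD passes in %)
td_pass_table = [
    ("Non-goal to go", 0.85, 61.39),
    ("1st and goal", 0.90, 11.67),
    ("2nd and goal", 1.56, 13.19),
    ("3rd and goal", 3.18, 12.50),
    ("4th and goal", 4.85, 1.25),
]

# Average bonus of a touchdown pass, weighted by how often each situation occurs.
avg_points = sum(diff * share for _, diff, share in td_pass_table) / 100

# Romer's scale: the 98 yards from the 1 to the 1 run from -1.55 to +5.55 points.
yards_per_point = 98 / (5.55 - (-1.55))

print(round(avg_points, 2))                    # 1.29
print(round(yards_per_point, 1))               # 13.8
print(round(avg_points * yards_per_point, 1))  # 17.8
```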

Note: Rushing touchdowns are very easy to value here, too. Why’s that? There were 386 rushing touchdowns in 2007. Of those, 130 occurred in non-goal to go situations. 117 occurred on 1st and goal, 94 on 2nd and goal, 40 on 3rd and goal and five on 4th and goal. Because the value of the situation that follows being tackled at the one is the same whether the previous play was a run or a pass, we can use the same numbers as above.

                   To the 1   TD    Diff   % of TD rushes
Non-goal to go     5.55       6.4   0.85   33.68
1st and goal       5.50       6.4   0.90   30.31
2nd and goal       4.84       6.4   1.56   24.35
3rd and goal       3.22       6.4   3.18   10.36
4th and goal       1.55       6.4   4.85    1.30

While the average passing touchdown was worth 1.29 points, the average rushing touchdown was worth 1.33 points. And of course, 1.33 points is equal to 18.4 yards.
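The rushing numbers can be rebuilt from the raw 2007 counts. This sketch assumes the same per-situation differences as the passing table and the same 98-yards-per-7.1-points conversion; nothing here goes beyond the figures already quoted.

```python
# 2007 rushing touchdowns by situation, with the TD bonus over reaching the 1.
rushing_tds = {
    "Non-goal to go": (130, 0.85),
    "1st and goal": (117, 0.90),
    "2nd and goal": (94, 1.56),
    "3rd and goal": (40, 3.18),
    "4th and goal": (5, 4.85),
}

total = sum(count for count, _ in rushing_tds.values())
avg_points = sum(count * diff for count, diff in rushing_tds.values()) / total
yards_per_point = 98 / 7.1  # Romer's scale, as used for passing touchdowns

for situation, (count, _) in rushing_tds.items():
    print(f"{situation:<16}{100 * count / total:6.2f}%")

print(total)                                   # 386
print(round(avg_points, 2))                    # 1.33
print(round(avg_points * yards_per_point, 1))  # 18.4
```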

It also follows from this analysis that return touchdowns by Josh Cribbs or anyone else are worth “only” 11.7 yards. That number isn’t too far from the number used by The Hidden Game of Football.


Life at the 1

Posted by Chase Stuart on Monday, September 29, 2008

What’s the difference between a touchdown and the ball at the one yard line? A touchdown is worth either 6, 6.4, 7 or potentially 8 points, depending on who you ask. Sticklers for details will tell you that a touchdown is no guarantee of a successful extra point, and is only worth six points. Most people will say that a TD is worth seven points, as teams that score touchdowns almost always come away with seven points. A touchdown is worth potentially 8 points, of course, because if you’re down by 8 you only need a touchdown to have a chance to tie.

And David Romer would tell you that a touchdown is worth 6.4 points — just like a field goal is worth 2.4 points — because following the mandatory kickoff the opposing team gets the ball at around the 27 yard line. And having the ball at around the 27 is worth about 0.6 points.

Arguments about the worth of a touchdown aside, the ball at the one is almost always going to be less valuable than a touchdown. But we know it’s not much less valuable. So, in fact, how much less valuable is it?

Ignoring the final minute of each half, teams had first down at the one yard line about 108 times in 2007. (I say about, because while my data is as close to complete and as accurate as any I know of, it’s certainly possible and even likely that I’m missing some specific plays.) What happened on those 108 “drives”?

On first down, 20 of the 108 teams threw the ball. Of those, 12 went for completions, and all twelve were touchdowns. Eight passes went incomplete, along with zero sacks and zero interceptions. Obviously 88 plays were rushes (although some may have been designed pass plays that turned into QB runs), with 38 of them going for touchdowns. Eight teams lost two, three or four yards. Eleven teams lost one yard (with one fumble lost) and 31 gained zero yards (with one holding penalty, meaning the team started at the 11 despite gaining zero yards on the rush).

To conclude, of the 108 plays in first and goal at the one situations (excluding those with one minute left in either half), 50* of them were touchdowns, 38 gained zero yards, one resulted in a fumble lost, and 19 left the teams with the ball and further away from the goal line. On second down, of the 38 plays from the one, 15 times the team ran for a touchdown and six times they threw for a score. Of 19 plays run from farther out, four times the team threw for a score and zero times the team ran for a score. One interception was thrown. That leaves 31 plays for third down.

Five times a team threw for a score; six times a team ran for a score. Two times a team threw an interception. Once a team threw for a score, had the play nullified by a penalty, and then scored on the ensuing third down attempt. So what happened on 4th down?

On the 17 remaining plays, 11 times the team kicked a field goal and all eleven were successful. Six times the team went for it, resulting in two touchdowns and four turnovers on downs.

To sum, 108 teams last year had the ball at the opponent’s one yard line on 1st down, with more than one minute to go in the half. 89 times (82.4%) the team scored a touchdown and 11 times the team scored a field goal. Four times the team turned the ball over, and four more times the team went for it and failed on fourth down.
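The down-by-down bookkeeping is easy to lose track of, so here is a sketch that tallies it. The per-down splits are my reading of the counts given in the preceding paragraphs.

```python
# Touchdowns by down: passes + rushes (plus the replayed third-down score).
touchdowns = {"1st": 12 + 38, "2nd": 15 + 6 + 4, "3rd": 5 + 6 + 1, "4th": 2}
# Turnovers by down: one lost fumble, then interceptions.
turnovers = {"1st": 1, "2nd": 1, "3rd": 2}
field_goals = 11
failed_fourth_downs = 4
drives = 108

td_total = sum(touchdowns.values())
turnover_total = sum(turnovers.values())

# Every drive should end in exactly one of the four outcomes.
assert td_total + field_goals + turnover_total + failed_fourth_downs == drives

print(td_total, f"{td_total / drives:.1%}")        # 89 82.4%
print(field_goals, f"{field_goals / drives:.1%}")  # 11 10.2%
no_score = turnover_total + failed_fourth_downs
print(no_score, f"{no_score / drives:.1%}")        # 8 7.4%
```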

It’s not too difficult to value the touchdowns and the field goals. What about the turnovers and the failed fourth down conversions? The fumble was recovered at the four. Two of the interceptions were caught in the end zone and downed there; one was returned to the six yard line. Obviously four isn’t a large enough sample size to feel confident about anything, but on average the opponent took over at the 12.5 yard line following the turnover (counting the two touchbacks as starting at the 20).

The turnover on downs data are probably more reliable. The defenses took over at the two, five, twelve and fourteen yard lines — on average, the 8.25 yard line. For all eight turnovers, the opposition took over at roughly the ten yard line.
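The field-position averages can be reproduced as follows. One assumption is mine: the two interceptions downed in the end zone are counted as touchbacks spotted at the 20, which is what makes the turnover average come out to 12.5.

```python
from statistics import mean

# Yard line where the defense took over after each turnover.
# Assumption: the two end-zone interceptions are touchbacks spotted at the 20.
turnover_spots = [4, 20, 20, 6]
# Yard line where the defense took over after each failed fourth down.
downs_spots = [2, 5, 12, 14]

print(mean(turnover_spots))                # 12.5
print(mean(downs_spots))                   # 8.25
print(mean(turnover_spots + downs_spots))  # 10.375
```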

We could re-look at the 2007 data as follows: 82.4% of the time teams facing 1st and goal from the one eventually score a touchdown; 10.2% of the time those teams settle for a field goal, and 7.4% of the time the defense ends up with the ball before any scores, at around the ten yard line.

Using Professor Romer’s logic, this means 82.4% of the time a team scores 6.4 points, 10.2% of the time a team scores 2.4 points, and 7.4% of the time a team scores about +0.35 points. Where’d I get that last number from? According to Romer, 1st and 10 from your own 10 yard line is worth about -0.35 points. So our offense that fails to score still puts its team in a position where it’s more likely than its opponent to score next. If we weight our averages, that means 1st and goal at the one yard line is worth between 5.5 and 5.6 points. Since a touchdown is worth 6.4 points, this means 1st and goal at the one is about 86-87% as good as a touchdown. It’s worth noting that Professor Romer reached the same exact result. According to his graph, 1st and goal at the one is worth 5.55 points. I wasn’t sure if he was right or not, but my query today makes me feel very confident that he was.
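Here is the weighted average spelled out, as a sketch using the raw drive counts rather than the rounded percentages. The +0.35 is the flip side of Romer's -0.35 for the opponent's 1st and 10 at its own 10.

```python
TD_VALUE = 6.4        # touchdown, net of the ensuing kickoff
FG_VALUE = 2.4        # field goal, net of the ensuing kickoff
NO_SCORE_VALUE = 0.35 # opponent takes over at around its own 10

# 2007 outcomes of the 108 drives that reached 1st and goal at the 1.
touchdowns, field_goals, no_scores = 89, 11, 8
drives = touchdowns + field_goals + no_scores

first_and_goal = (
    touchdowns * TD_VALUE + field_goals * FG_VALUE + no_scores * NO_SCORE_VALUE
) / drives

print(round(first_and_goal, 2))            # 5.54
print(f"{first_and_goal / TD_VALUE:.1%}")  # 86.6%
```

The result sits within a hundredth of a point of the 5.55 read off Romer's graph.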

*I’ll be discussing more plays from the one yard line tomorrow, but it’s worth noting that only 50 of the 108 plays on 1st down scored touchdowns. That rate of 46% is pretty low; on other downs and in general 3rd or 4th and 1 situations, teams convert at around a 55% clip. I checked the 2006 data (unfortunately, the rest of the data is too cumbersome to go back to ‘05 or ‘06 at this time) and the conversion rate was only 49%. But in 2005, the conversion rate was an incredible 65%. The weighted three year average was 54%, which is in line with what you’d expect.


When should the Lions have given up on Joey Harrington?

Posted by JKL on Friday, September 19, 2008

In June, Chase Stuart wrote a series of posts about quarterbacks, including the worst quarterbacks of all-time. In that post, he had this to say about Joey Harrington:

There you have it — no QB has performed so far below the league average for so long as Joey Harrington. To be clear, Joey Harrington probably isn’t the worst quarterback of all time in an absolute sense. But in terms of being so far below average, but far enough above miserable to earn more playing time, Joey Harrington hurt his team more than any other QB in NFL history. If Harrington had been worse, he would have played less, and he wouldn’t have set back the teams he played on.

So that got me thinking. At what point should the Lions have given up on Joey Harrington? Let me define what I mean by “give up”. It could mean releasing or cutting the player, but I don’t necessarily mean waiting until that point. I more consider it the point at which the team should bring in a veteran quarterback, or another high draft pick, to legitimately compete as the starter and potentially beat out Harrington–and I don’t count Mike McMahon as doing that. It’s just hard to say all of that in a quick and easy way.

Chase opined that the reason that so many of the “bad” quarterbacks were recent high draft picks is because teams give them many opportunities to fail. I think that’s right. And I’ll go so far as to say that teams are far more likely to commit errors of holding on to a quarterback for too long, while rarely giving up on a quarterback too early–once they have seen him play any amount of time in a real NFL game. I can think of examples of quarterbacks who were drafted, never started for their original team, and found success elsewhere, but it’s relatively rare to find a quarterback who started but never had success with his original team, and moved elsewhere to have his first breakout.

But I think NFL teams who hold on to a bad quarterback for too long are compounding their problems, and committing a new and independent error. Drafting Joey Harrington may have been a mistake, but having him as the best quarterback on the roster, and starting him for four years, is a bigger one. All NFL teams make drafting mistakes or get unlucky, but the good teams move on quicker and do not compound their mistakes.

We can probably think of examples of young quarterbacks struggling, but was there a point at which Harrington’s career path and numbers diverged from the quarterback successes? After all, he was the primary starter for four seasons–surely the decision could have been made before then.

(Continued)


Guest post: Best Draft Classes Revisited

Posted by Doug on Wednesday, September 17, 2008

Article by frequent pfr-blog commenter Richie Wohlers. Any transcription errors are the fault of Doug.

About 5 years ago I began a project where I wanted to try and evaluate NFL draft performances over the years. I wanted to come up with a simple method for evaluating all NFL players, regardless of position. I decided that the main goal of an NFL team when it drafts a player is to draft a player who is going to play in NFL games. So I figured I could just rate all players by the number of games they played. I decided to also award bonus points for players who made Pro Bowls or the HOF.

So I began to manually enter this information for all players. I could get some information from NFL.com and some from this website. But all the data was not available for all players. Needless to say, I never got too far in my task. I completed the 1999 draft for all players I could find, and then I pretty much put my project on the backburner. Then, when pro-football-reference.com added all this wonderful draft data and games played stats for basically every player since the 1950s, I could finally finish my project. I was just about done with my research when Doug came up with his Approximate Value formula and did a research project that was similar, yet superior, to mine.

Even though my method is more of an estimate than the AV method, and even though we’ve read a couple of posts on a similar topic, I went ahead and finished my research. I pulled out a few pieces of information which are a little different than what Doug and Chase have already posted in the past few months.

(Continued)


Reports of New England’s demise are greatly exaggerated

Posted by Doug on Saturday, September 13, 2008

I’m hearing and reading a lot of crazy stuff this week.

So I just want to document my predictions that (a) the Patriots will win at least 11 games this year, (b) the Patriots will clinch the East before week 17, and (c) Matt Cassel will be a top-12 fantasy quarterback from here out.

That is all.


2008 stats

Posted by Doug on Friday, September 12, 2008

This is our first time doing weekly updates with the new format, so it has taken a bit of time to get the week one stats to the site. They should be up sometime today, hopefully by the time you read this.

In general, p-f-r will update every Monday and/or Tuesday as it always has.

Thanks for your patience.


Michael Turner and what to expect from a 200 yard rusher the week after

Posted by JKL on Thursday, September 11, 2008

I don’t think it’s an overstatement to say that Michael Turner had a fantastic debut as a starter for the Atlanta Falcons last week. Over on the footballguys message boards, a poster named Abstract posed this very concrete question:

I was just sitting here thinking about Michael Turner’s outstanding performance last week and it got me to thinking. Does anyone know what history says about how a guy does following his 200 yard blow up? What does he normally do the next game?

Well, that’s what we are here for. What do guys who have big rushing weeks do as an encore? The short answer is, they usually don’t run for 200 yards the next game (only one player had back to back 200 yard games since 1995, can you name him?). If your memory is clouded by the most recent occurrences, you might be tempted to guess they don’t do so well, based on Jamal Lewis and Adrian Peterson’s follow up performances (twice) last year. Neither of those guys reached 70 rushing yards the following week.

As it turns out, it was those performances that were the exception. Certainly, the numbers regress the following week after a truly exceptional performance. But if you are wondering whether Michael Turner is likely to have a pretty good performance next week, based on history, the answer is yes.

Going back to 1995, there have been fifty occasions where a running back has rushed for at least 200 yards in a regular season game. Eight of those occurred in the final week of the regular season, so we will throw those out. For the remaining 42 cases, the running back played in the next game in all of them. Here’s how they did:

As a group, they averaged 21.4 rush attempts, 94.8 rushing yards, 24.1 receiving yards, and 1.0 total touchdowns the following week. In a non-points per reception scoring format, they averaged a pretty healthy 18.0 fantasy points the next week.

Just over half of them (22) rushed for at least 100 yards the following week. Over half of them (23) had at least 125 total yards the following week. Twenty-nine (69%) of them scored at least one touchdown the week after. Only seven of them compiled fewer than 100 total yards while also failing to score a touchdown, including both Lewis and Peterson (after the San Diego game) last year.

Okay, but Turner did it on only 22 carries, which included a 66-yard touchdown. What if we only look at the guys who reached 200 yards on relatively low rushing attempt totals? Last year, Doug posted the list of 200 yard games with the fewest rushing attempts. Turner’s effort would now rank on that list tied for fifteenth. Every back that gets to 200 rushing yards in a game is necessarily getting some yards in chunks, but perhaps backs that reach it with fewer attempts regress more the following week because it was more related to the luck of one or two long runs.

While I can’t go back and tell you how Cliff Battles followed up his 200 yard rushing game for the Boston Redskins in 1933, I can look at the players since 1995. Besides Turner, thirteen players have rushed for 200 or more yards in a game on 25 or fewer rushing attempts. Here’s how they followed up the next week:

21.9 rush attempts, 100.3 rush yards, 25.2 receiving yards, 1.1 touchdowns, 19.0 fantasy points
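For readers who want to see where the fantasy totals come from, here is a rough sketch. The scoring rule is an assumption on my part (a common non-PPR format of 1 point per 10 yards and 6 per touchdown), and because the inputs are rounded group averages, the results land near the quoted 18.0 and 19.0 without matching them exactly.

```python
def fantasy_points(rush_yards, rec_yards, touchdowns):
    """Assumed non-PPR scoring: 1 point per 10 yards, 6 points per touchdown."""
    return (rush_yards + rec_yards) / 10 + 6 * touchdowns

# Group averages for the week after a 200 yard game, as quoted in the post.
all_backs = fantasy_points(rush_yards=94.8, rec_yards=24.1, touchdowns=1.0)
low_carry = fantasy_points(rush_yards=100.3, rec_yards=25.2, touchdowns=1.1)

print(f"All 42 follow-up games:   {all_backs:.2f}")
print(f"25 or fewer carry group:  {low_carry:.2f}")
```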

So, the low carry group actually did slightly better than the group as a whole. Only three of them failed to reach 100 total yards the following week–Adrian Peterson last year against Dallas, Willie Parker in 2006 against Cleveland, and Marshall Faulk in 2000 against Kansas City (Faulk had 99 total yards). Only three of them failed to score a touchdown the week after–Warrick Dunn in 2000 against Miami, Edgerrin James in 2004 against Detroit, and Tiki Barber in 2005 against San Francisco. Tiki Barber’s 10.3 fantasy points were the fewest among this group. Barry Sanders had the most, with an encore performance of 167 rushing yards and 3 touchdowns against the Bears in 1997.

If you came here for fantasy advice, waffling on whether you should start Michael Turner or Julius Jones this week, my advice, as controversial as it may be, is go ahead and start Turner.