Wednesday, January 25, 2012

Why Do Score Effects Exist?

First off, I must thank and give full credit to 2+2 poster atakdog. Most of this article is just a small model tweak and different presentation of his work in this post. Worlds are colliding!

We talk a lot about score effects. The idea is that a team that is behind tends to dominate play, while a team that is ahead finds itself in its own end more often than when the score is tied. This occurs because teams have different incentives: if you are ahead, preventing the other team from scoring is more important than scoring yourself, so you'll play less aggressively and force the other team to work for it. Similarly, if you are behind, scoring becomes far more important, so you are willing to take chances: pinching with your defensemen, having them jump into rushes and so on.

In this article, I will take a look at score effects by graphing out the incentive to score and prevent goals. In future articles, to come out in the next week, we will use a similar methodology to look at how the points system (2 for any kind of win, 1 for an OT/SO loss, 0 for a regulation loss) affects incentives and whether there might be better systems out there.

A Simple Model

It wouldn't be a JaredL article, or an atakdog derivative for that matter, without introducing a model. To look at the incentives to score and prevent goals, I took the average goals per team per game since the lockout, about 2.88, and divided by 60 to get the average goals per minute, roughly 0.048. I made this number the probability of either team scoring in any given minute. Using backward induction, I determined the probability of winning the game at any given minute with any score difference. So this is two average teams facing each other, each scoring at the average rate every minute and each with a 50% chance of winning the extra point in overtime or the shootout should it get that far. I also assumed that if at any point in the game one team is up 10 goals then they will certainly win.
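For anyone who wants to play along at home, here is a minimal sketch of that backward induction. It is not the exact spreadsheet behind the graphs. It assumes at most one goal per minute, reads "either team" as each team scoring with probability 0.048 in a given minute, and tracks expected league points (2 for a win, 0 for a regulation loss, 1.5 for reaching overtime as a coin flip). The names expected_points_table and marginal_benefits are made up for illustration.

```python
def expected_points_table(goals_per_game=2.88, minutes=60, cap=10):
    """Expected league points for 'our' team by minute and goal difference.

    Assumptions of this sketch: at most one goal per minute, and each team
    scores in any given minute with the same probability p.
    """
    p = goals_per_game / minutes  # roughly 0.048 per team per minute
    diffs = range(-cap, cap + 1)

    def terminal(d):
        if d > 0:
            return 2.0   # regulation win
        if d < 0:
            return 0.0   # regulation loss
        return 1.5       # 1 point for the tie plus a 50% shot at the extra point

    # table[t][d]: expected points with t minutes played and goal difference d
    table = [None] * (minutes + 1)
    table[minutes] = {d: terminal(d) for d in diffs}
    for t in range(minutes - 1, -1, -1):
        nxt = table[t + 1]
        cur = {}
        for d in diffs:
            if abs(d) == cap:
                cur[d] = terminal(d)  # a 10-goal lead is treated as decided
            else:
                cur[d] = p * nxt[d + 1] + p * nxt[d - 1] + (1 - 2 * p) * nxt[d]
        table[t] = cur
    return table


def marginal_benefits(table, minute, diff):
    """Points gained by scoring, and by preventing a goal, during `minute`."""
    nxt = table[minute + 1]
    score = nxt[diff + 1] - nxt[diff]
    prevent = nxt[diff] - nxt[diff - 1]
    return score, prevent


table = expected_points_table()
print(marginal_benefits(table, 59, 0))  # final minute, score tied: (0.5, 1.5)
```

The second function is what the graphs further down are built from: the gap between starting the next minute one goal better off (or worse off) and starting it at the current score. Because of the one-goal-per-minute shortcut, the exact values will not match the graphs to the decimal.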

You may find it a little strange that we're looking at score effects using a model which assumes that they don't exist. A good way to think about this is to ask what happens if the other team plays exactly the same way no matter what the score is. How should we respond?

Score Tied

Let's start with the score tied. If this were baseball, basketball or either North American brand of football, this would be simpler. Unlike those sports, the NHL tiebreaking rules make hockey games non-zero-sum. If two teams tie, then the total number of points both get goes from 2 to 3. Note that in soccer it is exactly the opposite - if a game ends in a draw, the number of points drops from 3 to 2. The NHL rules actually make the incentive to score and prevent goals different when the score is tied, which we'll cover in greater depth in future articles.

Here is a graph showing the marginal benefit, in league points, of scoring and preventing a goal with the score tied. The horizontal axis represents what minute of the game it is and the vertical how many expected points are gained by scoring or preventing a goal. For scoring this would be the difference between starting the next minute up one and starting the next minute tied. For preventing this is the difference between starting the next minute tied and starting it down 1.


This gives us a somewhat strange pattern. Early in the game, scoring and preventing goals are about equally important. When the score is tied you will gain or lose about a third of a league point on average if a goal is scored. Very late in the game, this changes and getting to the 1-point-bonus round becomes the important thing. It's easy to see why we don't like this, but let's move on and look at score effects right now.

The One-Goal Game

Let's now shift to the score not being tied and start with the team that is ahead in a one-goal game. Here is a similar graph:


You can see that in every minute of the game, preventing a goal is more important than scoring. It's an interesting coincidence that at the start of the third period preventing a goal is almost exactly twice as important as scoring one. I think this is a bit overstated because my model does not take into account a team pulling the goalie, which should make it just a bit more important to be up 2 goals. Something worth noting is that the points system somewhat cushions the cost of conceding a goal here: if you are up one then giving up a goal very late isn't all that bad; at worst it costs you just over half a league point. Perhaps paradoxically, it's worse to give up a late goal when the score is tied than when you are up one.

Here is the graph for the team that is losing:


Perhaps the clearest thing from this is the already obvious justification for pulling the goalie: in the last couple of minutes giving up a goal almost doesn't matter at all, while scoring is worth close to a point and a half. I've wanted to take a closer look at the optimal time to pull the goalie for a while, and hopefully will get to it, but just eyeballing this graph it seems earlier than usual might be better.

Another thing about this graph is that it provides some justification for the definition of close game, I believe first proposed by Eric T. over at Broadstreet and now used by many, including Gabe Desjardins for his power rankings, which he posts far more regularly than I (coming soon, I promise!). Under that definition, a game is close if it's within a goal in the first two periods and tied in the third. While such a definition is always somewhat arbitrary, we can see some justification for it by noting that the incentive to score for the team that is behind stays relatively flat earlier but really moves upward in the third period.

Something worth noting is that a late goal matters far more to the team that is behind. A goal right at the end of regulation is three times as beneficial for the trailing team as it is costly for the leading team: the trailing team goes from 0 points to an expected 1.5, while the leading team drops from 2 to an expected 1.5. Again, this is due to the extra point being given in tie games. To better see the size of these effects, here's a graph with all four together:


Based on these incentives, it's not surprising that so much more of the play is in the leading team's end of the ice. The key for both teams is putting the puck in or keeping it out of that goal. The later in the game, the stronger this effect is and that's mostly because of how big a goal would be for the team that is behind.

Two-Goal Games

I'll be very quick with two-goal games and just show the combined graph. Here it's interesting because right near the end of the game it basically doesn't matter what happens - with a minute or two to go the team up two goals is almost certainly going to win whether they give up a goal or not. The key time is 10-15 minutes out when the trailing team has a decent chance to get another goal and equalize. Again, this slightly overstates the case because it assumes teams leave their goalie in there at the end, but the overall shape of the graphs would be mostly the same:



Score/Prevent Benefit Ratio

Finally, here's a graph of the ratio of the marginal benefit of scoring to preventing the other team from doing the same based on the score and time. The larger this is, the more important scoring a goal is relative to preventing one. If the ratio is greater than one that means scoring is more beneficial than giving up a goal is bad, the opposite if it is less than one.


I cut it off at 5 because it shoots way up for the team that is behind. With two minutes to go, scoring becomes 40 times as important as preventing the other team from putting the puck in your net if you are down 2. It is very clear that while teams that are ahead have some incentive to play more defensively, most score effects are driven by the team that is trailing.

In the next installment, I will look at how score effects, and play with the score tied, would be expected to change under alternative points systems such as 3-2-1-0, the simple 2-1-0 with ties at the end of regulation and the soccer system which is 3-1-0.

Wednesday, January 18, 2012

Was the Brian Elliott Extension a Good Idea?

Earlier today, the Blues decided to re-sign goaltender Brian Elliott to a 2-year extension worth $3.6 million in the midst of his first All-Star campaign. On the surface, this doesn't seem like a terrible move for St. Louis. After all, should Elliott keep form through the end of the season, another team would probably have to pay more than $3.6 million for his services. The biggest problem with this, however, is that Elliott's current form differs quite a bit from what we've seen over his career (numbers via NHL.com):

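Season | Team | GP | ES SA | ES GA | ES SV%
2007-2008 | OTT | 1 | 22 | 1 | 0.955
2008-2009 | OTT | 31 | 628 | 50 | 0.920
2009-2010 | OTT | 55 | 1089 | 101 | 0.907
2010-2011 | OTT/COL | 55 | 1237 | 124 | 0.900
2011-2012 | STL | 22 | 459 | 25 | 0.946
Career | 3 Teams | 164 | 3435 | 301 | 0.912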
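As a quick check on the table, even-strength save percentage is just one minus goals against over shots against. The sketch below recomputes each row from the ES SA and ES GA columns, then backs out Elliott's mark before this season by subtracting the 2011-2012 row from the career totals. The function and variable names are only illustrative.

```python
def save_pct(shots_against, goals_against):
    """Save percentage as a fraction: the share of shots that were stopped."""
    return 1 - goals_against / shots_against


# (season, ES shots against, ES goals against), copied from the table above
seasons = [
    ("2007-2008", 22, 1),
    ("2008-2009", 628, 50),
    ("2009-2010", 1089, 101),
    ("2010-2011", 1237, 124),
    ("2011-2012", 459, 25),
]

for season, sa, ga in seasons:
    print(season, round(save_pct(sa, ga), 3))

career_sa = sum(sa for _, sa, _ in seasons)
career_ga = sum(ga for _, _, ga in seasons)

# Everything before 2011-2012: career totals minus the current season
prior_sa = career_sa - 459
prior_ga = career_ga - 25

print("career", round(save_pct(career_sa, career_ga), 3))
print("before 2011-2012", round(save_pct(prior_sa, prior_ga), 3))
```

On these numbers the pre-2011-2012 figure works out to roughly .907 on 2,976 shots, so the .946 he has posted on 459 shots this season sits well above anything in his track record.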

We've seen teams give out similar contracts based on similar samples, contracts which don't always turn out so well in the department of expected performance (See: Leighton, Michael). At the end of the day, however, a $1.8 million cap hit isn't going to handcuff a team beyond repair. If Elliott turns out to be at least average or a little better, St. Louis will have him locked in at a very good price. If he instead regresses from this 22-game sample back to his career averages, the modest average annual value will make this an easy contract to trade or demote, a win-win for the Blues.

Wednesday, January 11, 2012

On the Blue Jackets' Struggling Power Play

Yesterday I made a short post on the Blue Jackets' low PDO, making the point that we should expect the Jackets to have a much better second half as their shooting and save percentages climb back towards the mean. In the comments, Rob Vollman brought to my attention this Hockey Prospectus article by Timo Seppa, which brings into question Columbus' Power Play production dropoff after the transition from Ken Hitchcock to Scott Arniel. Timo asserts that
Arniel's track record wasn't favorable, particularly when compared to that of former coach Ken Hitchcock. In many aspects, the Blue Jackets had taken steps backwards since his departure, and they'd done no better than tread water elsewhere.

One example is on the power play. Never a particular strength of the Jackets—you need star players to have a truly upper echelon man advantage—the production of several key contributors had taken a highly visible nosedive in a year-and-a-half under Arniel's watch:

In other words, the dropoff in scoring under Arniel signifies a dropoff in overall power play production since Hitchcock left. That is a fair assertion judging by what these numbers tell us on the surface. However, I actually believe the Blue Jackets were a better PP team under Arniel than under Hitchcock. Let's take a look at why with some numbers via Behind the Net and NHL.com:

Season | 5v4 SF/60 (NHL Rank) | 5v4 SH%
2007-2008 | 44.8 (24) | 11.4
2008-2009 | 47.2 (25) | 9.4
2009-2010 | 51.8 (12) | 12.9
2010-2011 | 53.7 (8) | 8.9
2011-2012 | 53.5 (6) | 9.1

Timo only looks at '08-09 onward, but I decided to include '07-08 under Hitchcock since I don't have the ability to break down the '09-10 season with splits before & after Hitchcock was let go. Regardless, it is clear from these numbers that Columbus actually increased their shot rate on the PP under Arniel to levels that the Hitchcock-coached team were never able to reach.

How, then, could Columbus have actually posted better scoring results under Hitchcock? The answer is simple: shooting percentage. As Jared has pointed out, there is a tremendous amount of luck involved in shooting percentages, especially in small sample sizes. When you're looking at results on the power play, it is important to keep in mind that PP shots are going to represent about 1/4 (or less) of a team's total shots throughout the regular season, a small enough sample that luck is still going to be a major factor in scoring goals. The fact of the matter is this: the shots were going in under Hitchcock, and they weren't under Arniel. Regardless of whether Rick Nash was playing in front of the net or on the half-wall, the team was still unlucky to shoot at such a low percentage under Arniel, which resulted in lower point production for the team.
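To put rough numbers on that, power play goals per 60 minutes is just the shot rate multiplied by the shooting percentage. The sketch below applies that identity to the 5v4 table above; the function name and layout are my own, and the inputs are only the figures already listed.

```python
def goals_per_60(shots_per_60, shooting_pct):
    """Goals per 60 minutes from shot rate and shooting percentage (out of 100)."""
    return shots_per_60 * shooting_pct / 100


# (season, 5v4 SF/60, 5v4 SH%), copied from the table above
power_play = [
    ("2007-2008", 44.8, 11.4),
    ("2008-2009", 47.2, 9.4),
    ("2009-2010", 51.8, 12.9),
    ("2010-2011", 53.7, 8.9),
    ("2011-2012", 53.5, 9.1),
]

for season, sf60, sh_pct in power_play:
    print(season, round(goals_per_60(sf60, sh_pct), 2))
```

2009-2010 comes out near 6.7 goals per 60 and 2010-2011 near 4.8, even though the shot rate rose from 51.8 to 53.7. The whole drop comes from the percentage column.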

Though there were definite shortcomings in Arniel's system, e.g. going into an absolute shell when up by 1 or 2 goals, this is more evidence that he was yet again on the wrong side of luck.

Tuesday, January 10, 2012

On the Firing of Scott Arniel

Yesterday the Columbus Blue Jackets announced that they had relieved head coach Scott Arniel of his duties. This shouldn't come as a surprise to many: the Jackets are in the midst of a disappointing 11-25-5 start, good for a league-worst 27 points in the standings. After an exciting offseason, the Jackets were poised to make a run at the playoffs as predicted by a slew of excellent bloggers. Before the 'I told you so' comments start pouring in, however, let's take a look at a few of Columbus' underlying statistics (via BTN here and here):

ES SH% | ES SV% | PDO | Score-Tied Fenwick %
7.3 | 0.905 | 978 | 50.7

For the past few weeks, Gabe Desjardins has been writing about the importance of PDO and its tendency to regress towards the mean as the season progresses. Columbus is currently rivaling the worst PDO we've seen over the past four seasons, yet their possession statistics indicate that GM Scott Howson has indeed built a strong group of skaters. In net, even though Steve Mason may not be very good, he is also well below his career ES SV% numbers. This is good news for interim coach Todd Richards should (when) the Jackets see a revival of sorts during the second half of the season. We've already seen Ken Hitchcock receive media praise for a turnaround in St. Louis that was bound to happen anyway, and don't be surprised if we see the same for Todd Richards in Columbus.
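For readers new to the stat, PDO is simply even-strength shooting percentage plus even-strength save percentage, here on the 1000-point scale used in the table. A small sketch with Columbus' numbers follows; the function name and the mixed input units (percent for shooting, fraction for saves) are my own choices to match how the table prints them.

```python
def pdo(es_shooting_pct, es_save_pct):
    """PDO on a 1000-point scale.

    es_shooting_pct is in percent (e.g. 7.3); es_save_pct is a fraction (e.g. 0.905).
    """
    return (es_shooting_pct + es_save_pct * 100) * 10


columbus = pdo(7.3, 0.905)
print(round(columbus))         # 978
print(round(1000 - columbus))  # 22 points under the 1000 that the league as a whole sums to
```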

Friday, January 6, 2012

Anomalies: Is Shooting Percentage Predictive?

There is a mountain of evidence that shooting percentage is overwhelmingly driven by luck. We've written a few articles on it and basically every top blogger has either written directly on the great role of randomness in shooting percentage or makes frequent use of that fact in analyzing hockey stats.

To summarize all of that work very briefly, over a season or less worth of data team shooting percentage is mostly driven by luck. It is not very sustainable. In other words, there is very little correlation between shooting percentages in one period of time and another, whether that's even/odd numbered games, the first half of the season and the second, or one season and the next. Stats such as shooting percentage, save percentage and the sum of these, referred to as PDO, show very high regression to the mean. So a team shooting for a low percentage in the first half of the season is essentially as likely as one shooting well to make a high percentage of shots in the second half of the year. I am fully on board with shooting results being mostly luck and have done a bunch of work on this myself.

A related but separate question is whether shooting percentage is predictive of future scoring. For example, is a team that shot at a high percentage in the regular season going to score more in the playoffs on average than a team that shot at a low percentage? All signs would seem to point to no. In addition to the work summarized above, if you just look at the correlation between shooting percentage in the regular season and scoring rate in the playoffs, it is very low. As you may have guessed by the existence of this column, and it having "anomalies" in the title, shooting percentage does turn out to have both statistical and, I would argue, actual significance in predicting future scoring.

Results

To study the predictability of regular-season shooting percentage on playoff scoring rate, I took 5-on-5 data from BTN for both the regular season and playoffs for the four seasons from 2007-2008 through 2010-2011. This gives us a sample of 64 team seasons. The variable we are trying to predict is goal-scoring rate (5-on-5 GF/60) in the playoffs. Here is the regression equation for playoff scoring rate on regular-season 5-on-5 shooting percentage (expressed out of 100) and, importantly, regular-season shooting rate (5-on-5 SF/60):

Playoff scoring rate = 0.213 * RS Sh% + 0.132 * SF/60 - 3.482

The coefficient on shot rate is nearly 4 standard errors greater than zero and very strongly significant. That should surprise nobody; shot rate is a solid predictor of future scoring. More surprising is that the p-value for regular-season shooting percentage is 0.014, which is easily significant at the standard 5% level and almost significant at the stricter 1% level. Despite how much variance there is in playoff scoring rates (you have to factor in matchups, and some teams only play 4 games), once you include shooting rate, shooting percentage is a statistically significant predictor of playoff scoring!

That's great for the stats nerds, but is it significant enough for anyone else to care? Let's look at this with a hypothetical example. Let's take two teams that had the playoff-team average shooting rate 5-on-5, give one the playoff-team average shooting percentage in the regular season and the other a shooting percentage one standard deviation higher. Here is a table with their regular-season shooting rates, RS shooting percentages and expected scoring rate in the playoffs:

RS Sh% | RS SF/60 | Exp. Playoff GF/60
9.264% | 29.927 | 2.438
8.44% | 29.927 | 2.262
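If you want to plug in your own team, the regression equation above is easy to turn into a function. This sketch uses the rounded coefficients as printed, so it lands a few thousandths away from the table values, presumably because the table was built from unrounded coefficients. The function name is made up for illustration.

```python
def predicted_playoff_gf60(rs_sh_pct, rs_sf60):
    """Predicted playoff 5-on-5 GF/60 from regular-season Sh% (out of 100) and SF/60."""
    return 0.213 * rs_sh_pct + 0.132 * rs_sf60 - 3.482


average_team = predicted_playoff_gf60(8.44, 29.927)
hot_team = predicted_playoff_gf60(9.264, 29.927)

print(round(average_team, 3))             # table lists 2.262
print(round(hot_team, 3))                 # table lists 2.438
print(round(hot_team - average_team, 3))  # gap in GF/60 between the two teams
```

The gap between the two teams comes out at about 0.176 goals per 60 minutes, matching the difference between the two table rows.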

Going by average 5-on-5 ice time and series length, you could say that if two teams have the same shooting rate, the team with the good shooting percentage will score a little under a goal per series (0.81) better than the one that is average. If you compared one team with a high percentage and another below average it would be higher. If you'll forgive me for making simplifying assumptions such as independence, over 20% of first-round matchups should feature teams with shooting percentages different enough for their predicted expected goals for in the series to be off by more than a goal. I think it's large enough that this is significant in practice, not just statistically so.

Better to be good than lucky.

I want to emphasize that while shooting percentage appears to be a significant predictor, shooting rate is stronger. A good way to look at this is to consider predicted scoring rates for teams one standard deviation above the mean in one or both of these.

+1 SD | Shot Rate | Shooting% | Predicted GF/60
Both | 32.014 | 9.26% | 2.713
Shot Rate | 32.014 | 8.44% | 2.538
Shooting% | 29.927 | 9.26% | 2.438
Both Avg | 29.927 | 8.44% | 2.262

You can see from the middle two rows that a team with a higher rate but average percentage is predicted to score more goals than one with an average rate and higher percentage. Using a method similar to the one above, the team with the better rate will score just under half a goal more per series.
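The half-goal figure can be sanity-checked from numbers already in this post. The sketch below back-calculates how many 5-on-5 minutes per series are implied by the earlier 0.81 goals per series, then applies the same minutes to the gap between the two middle rows of the table. The implied-minutes value is my own back-calculation, not a figure used in the original analysis, and the names are illustrative.

```python
# Gap in predicted GF/60 between the +1 SD shooting% team and the average team
pct_gap = 2.438 - 2.262

# That gap was put at 0.81 goals per series earlier in the post, which implies
# this many 5-on-5 minutes per series (a back-calculation, not a sourced number)
implied_minutes = 0.81 / pct_gap * 60


def goals_per_series(gf60_gap, minutes=implied_minutes):
    """Convert a gap in GF/60 into goals over a series of the given length."""
    return gf60_gap * minutes / 60


# Higher shot rate with average percentage vs. average rate with higher percentage
rate_vs_pct = goals_per_series(2.538 - 2.438)

print(round(implied_minutes))
print(round(rate_vs_pct, 2))
```

That works out to roughly 276 minutes of 5-on-5 play per series and a gap of about 0.46 goals, in line with "just under half a goal".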

Spurious Explanations

So we have established that if you take two teams with equal shooting rates but different shooting percentages in the regular season, the one that made a higher percentage of its shots will have a significantly higher average scoring rate in the playoffs. I have come up with two theories on why this would be the case even under the extreme idea that shooting is all luck.

The first is that the higher shooting percentage means the team probably scored more goals, which means they probably had a better record and perhaps this means they faced weaker competition in the playoffs. So luck-based results in the regular season led to weaker playoff competition. This seems unlikely because changes would be so marginal, but in theory it could be the case.

A more complicated idea is based on score effects. It's pretty well established that teams tend to shoot at a higher rate when they are behind and a lower rate when they are ahead. Extending this, the team that got the bounces to go their way was in the lead more often. Since they were in the lead more often but put up the same shooting rate then they are better when it comes to shot rate and that will shine through in the playoffs.

Neither of these theories can be disproven 100%, but I think we can safely conclude that such effects would be quite small. I got at this by including a different variable: regular-season goal-scoring rate for all time that isn't 5-on-5. So this includes all special teams, 4-on-4 and 3-on-3. Since this is goal-scoring rate directly, and about 1 in 3 goals are scored in such situations, it seems like this would have a stronger connection to playoff seeding or being ahead than luck-based 5-on-5 shooting percentage. When I ran a similar regression of playoff scoring on 5-on-5 shooting rate and non-5-on-5 scoring rate the latter wasn't close to significant, with the standard error greater than the size of the coefficient and the R^2 barely budged from using shot rate alone. There are also other reasons I'll get to in the near future that make me think score effects are out as an explanation.

I think we can reasonably conclude that while there may be some negligible score or even playoff-matchup effects, that's not what's driving most of the predictive power of regular-season shooting percentage.

Shot Quality? Aim?

Having racked my brain for the week or so since discovering this, the only explanation I can find is our good friend shot quality. Getting high-quality shots isn't easy; the other team is doing their best to keep you out of the dangerous areas and stop rebounds from going there. If you have two teams that shoot at the same rate but one gets more high-quality shots then they are probably better possessing the puck and generating offense. Perhaps an entire season of 5-on-5 shots is enough for this to shine through.

There is some evidence to back this theory up. Running a regression, we find that if you take into account regular-season SF/60, regular-season shooting percentage is a significant predictor of playoff shot rates! In other words, if you have two teams that shot at the same rate during the regular season then the one that shot for a higher percentage will, on average, have a higher playoff SF/60. The coefficient is 0.918, so increasing regular-season shooting% by 1.1 percentage points would increase the expected shot rate in the playoffs by about 1 shot per 60 minutes.

To me, shooting percentage indicating something about future shooting rates is pretty strong evidence that if it's either of the two, it's shot quality and not sniping ability. If you had a team very good at hitting the corners during the regular season, it's not clear why they would be more likely to take more shots in the playoffs. In contrast, if you have a team very good at creating shots from just in front of the crease then it seems reasonable that they'd shoot more often in the future because that takes skill that more readily translates.

What do you think?

We'd love to hear your thoughts on this. In particular, I'd like any alternative theories beyond shot quality. Are you surprised at all by this? Does it make sense? Let us know by leaving a comment, or a tweet @drivingplay.