Friday, August 19, 2011

On Luck, Skill and Sample Size in Shooting Percentage

The roles of skill and luck in shooting are an important and often misunderstood part of hockey analysis. This is particularly true when analytical and traditional fans get together. A typical discussion might go something like:

A: Steven Stamkos got 91 points with 45 goals at the age of 21. He's definitely getting 100 points next year, and could easily score 50 goals a season as he improves.
B: Yeah, but he was lucky to make 16.5% of his shots. That's not sustainable and his numbers will probably go down next year, even if he does actually improve.
A: So shooting is all luck? You clearly know nothing about hockey and should actually watch some games instead of just sitting at your computer all day coming up with fancy stats that don't mean anything.

Turns out both guys have a point. Stamkos should probably expect his stats to drop next season because, in addition to goal numbers trending downward for the league as a whole, he probably won't keep burying 16.5% of his shots. On the other hand, person B probably should watch more hockey; it is a great game.

A lot of the confusion comes from incorrect either/or thinking. Scoring on a high percentage of your shots is a result of both luck and skill. I'm not just putting down the traditional fans here. Those of us in the analytical community, myself included, are prone to bad thinking as well. While most people ignore or underrate the importance of luck in hockey in general and shooting in particular, we tend to go too far the other way, chalking everything up to luck and ignoring the skill aspect.

Shooting for a high percentage is skill based. Two of our favorite writers, JLikens of objectivenhl fame and Gabe Desjardins from BTN and arcticicehockey, have written several articles on the subject.

In case their articles are not convincing enough, let's consider two of the best players in the game: Henrik Sedin and Sidney Crosby. While not necessarily known as snipers, these players have all the skills that one might think lead to their team putting a high percentage of shots in the net. Both have elite vision, passing ability, hands, positioning and, in one case, telepathy. We would all expect their teams to have better shooting percentages when they are on the ice than when they are sitting on the bench or worse. The numbers bear this out. Here is a chart with their teams' performances at even strength with both goalies in net from the last four seasons combined. The stats are courtesy of noted Driving Play reader Vic Ferrari's timeonice scripts, which you can find information on how to use here.

Team | Goals | Shots On Goal | Shooting %
Penguins, Crosby on Ice | 236 | 2160 | 10.9%
Penguins, Crosby off Ice | 414 | 5347 | 7.7%
Canucks, Henrik on Ice | 273 | 2624 | 10.4%
Canucks, Henrik off Ice | 359 | 4709 | 7.6%

As you can see, the Pens with Crosby shot 3.2 percentage points higher than they did without him. While some of it may be variance, with the number of shots they took with him on, that's a difference of 69 goals or more than 17 goals per season due to better shooting. The Canucks shot 2.8 points higher with Henrik on the ice, a difference of over 73 goals, more than 18 per season, when you consider how many shots they took with him on. For the statistically minded, these shooting-percentage differences are very very very significant. To give you an idea, it varies field to field but the most common benchmark is for there to be less than a 5% chance of results this extreme, or more so, due to variance alone. That's a 1-in-20 chance. For Crosby, there is a 0.00045% chance, or less than 1 in 222,000. For Hank there is a 0.00235% chance of results that extreme due to randomness alone - less likely than 1 in 42,000. Again, 1 in 20 is the usual mark. The data confirm what anyone would guess from watching a few games - Henrik Sedin and Sidney Crosby help their teams shoot better. (Note: if you are a hater and/or think that it's the likes of Alex Burrows and Pascal Dupuis who are driving these results, feel free to be wrong. The point of this is to provide evidence of shooting skill and clearly someone has it when these two are on the ice.)
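The post doesn't say which test produced those probabilities. One plausible guess is a one-sided, pooled two-proportion z-test on the on-ice and off-ice rows of the chart. Here is a minimal sketch under that assumption; the function name `one_sided_p` is made up for illustration, and the inputs are the goals and shots from the chart above.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_p(goals_on, shots_on, goals_off, shots_off):
    """Pooled two-proportion z-test (an assumed method, not confirmed by the post).

    Returns (z, p) where p is the chance of an on-ice edge at least this
    large if on-ice and off-ice shooting talent were actually identical.
    """
    p_on = goals_on / shots_on
    p_off = goals_off / shots_off
    pooled = (goals_on + goals_off) / (shots_on + shots_off)
    se = sqrt(pooled * (1 - pooled) * (1 / shots_on + 1 / shots_off))
    z = (p_on - p_off) / se
    # Upper-tail probability, written as cdf(-z) for numerical accuracy.
    return z, NormalDist().cdf(-z)

rows = {
    "Crosby": (236, 2160, 414, 5347),
    "Henrik": (273, 2624, 359, 4709),
}
for name, row in rows.items():
    z, p = one_sided_p(*row)
    print(f"{name}: z = {z:.2f}, one-sided p = {100 * p:.5f}%")
```

If the guess about the method is right, the printed percentages should land in the same neighborhood as the ones quoted above. A two-sided test would roughly double them, which still leaves both far below the 5% benchmark.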

Let's now look at the role of luck on shooting percentage. To do this, I will run simulations comparing the results of a typical team that shoots well and one that does poorly. In this article on objectivenhl, which is worthy of being linked again, JLikens finds that the average team shoots at an 8.1% clip 5-on-5, with a standard deviation of 0.48%. Going by this, a team that is good at shooting, let's say 7th or 8th best in the league, would have a true 5-on-5 shooting percentage of something like 8.42%. On the other hand, a team that is bad at shooting, say 7th or 8th worst in the league, would be expected to score on about 7.78% of their shots.
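For anyone wondering where 8.42% and 7.78% come from, one reading that reproduces them is to treat team shooting talent as normally distributed around 8.1% with a 0.48-point standard deviation and look up the quantile for a team ranked 7.5th out of 30. The normality assumption, the 30-team league size and the helper name `true_pct_at_rank` are my own labels for this sketch, not something spelled out above.

```python
from statistics import NormalDist

LEAGUE_MEAN = 8.1   # average 5-on-5 shooting %, per JLikens
LEAGUE_SD = 0.48    # standard deviation across teams, in percentage points
N_TEAMS = 30        # assumption: a 30-team league

def true_pct_at_rank(rank):
    """Shooting % of the team ranked `rank` from the top (1 = best),
    assuming talent is normally distributed across the league."""
    quantile = 1 - rank / N_TEAMS
    return NormalDist(LEAGUE_MEAN, LEAGUE_SD).inv_cdf(quantile)

good = true_pct_at_rank(7.5)    # "7th or 8th best"
bad = true_pct_at_rank(22.5)    # "7th or 8th worst"
print(f"good: {good:.2f}%  bad: {bad:.2f}%")
```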

Let's see how things shake out. Below is a chart giving the results of 10,000 simulations for various numbers of shots where team A has a true shooting percentage of 8.42% and team B shoots at 7.78%. The first two columns tell you the given time period and number of shots for each team. The next three columns tell you how often the team good at shooting outscored the bad one (column 3), how often the bad team outscored the good one (4) and how often they scored the same number of goals (5). The last two columns give what percent of the time someone looking at the data, and not knowing the underlying percentages, would get statistical significance at the 5% level. Notice that in the last column, the statistical test would reveal that B is significantly better at shooting than A despite their shooting skill actually being over half a percentage point worse.

Time period | Number of Shots | A scores more | B scores more | Goals scored equal | A > B significant | B > A significant
One Period | 8 | 31.9% | 28.7% | 39.4% | 1.3% | 1.1%
One Game | 24 | 42.6% | 36% | 21.4% | 6% | 4.5%
1/4 Season | 500 | 62.3% | 33.2% | 4.5% | 10.5% | 2.2%
1/2 Season | 1,000 | 68.6% | 28.5% | 2.8% | 13.4% | 1.4%
1 Season | 2,000 | 76.4% | 21.7% | 1.9% | 17.9% | 0.8%
2 Seasons | 4,000 | 85% | 14% | 1% | 27.8% | 0.2%
3 Seasons | 6,000 | 90.1% | 9.3% | 0.6% | 36.1% | 0.1%
4 Seasons | 8,000 | 93.6% | 6% | 0.4% | 43.7% | 0.1%
5 Seasons | 10,000 | 95.6% | 4.1% | 0.3% | 50.3% | 0%
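If you want to try this kind of simulation yourself, here is a rough sketch in plain Python. It is not the code behind the chart. In particular, the significance test used here (a one-sided pooled z-test at the 5% level) is an assumption, so the last two columns may not line up with the chart, and the function names are invented for the example.

```python
import random
from math import sqrt
from statistics import NormalDist

P_A, P_B = 0.0842, 0.0778            # true shooting percentages from the post
Z_CRIT = NormalDist().inv_cdf(0.95)  # assumption: one-sided test, 5% level

def goals(n_shots, p, rng):
    """Goals on n_shots independent shots, each going in with probability p."""
    return sum(rng.random() < p for _ in range(n_shots))

def z_stat(g1, g2, n):
    """Pooled two-proportion z statistic when both teams take n shots.
    Returns 0.0 when nobody scored (or everybody did), where it is undefined."""
    pooled = (g1 + g2) / (2 * n)
    if pooled in (0.0, 1.0):
        return 0.0
    return (g1 - g2) / n / sqrt(pooled * (1 - pooled) * 2 / n)

def simulate(n_shots, n_sims=10_000, seed=0):
    """Fraction of simulated samples falling into each column of the chart."""
    rng = random.Random(seed)
    tally = {"A more": 0, "B more": 0, "equal": 0, "A sig": 0, "B sig": 0}
    for _ in range(n_sims):
        a = goals(n_shots, P_A, rng)
        b = goals(n_shots, P_B, rng)
        tally["A more" if a > b else "B more" if b > a else "equal"] += 1
        z = z_stat(a, b, n_shots)
        tally["A sig"] += z > Z_CRIT
        tally["B sig"] += z < -Z_CRIT
    return {key: count / n_sims for key, count in tally.items()}

# Small shot counts and fewer runs keep the pure-Python loop quick.
for n in (8, 24, 500):
    print(n, simulate(n, n_sims=2_000))
```

Simulating every shot one at a time gets slow for the multi-season rows, so for those you would want a proper binomial sampler instead of the loop in `goals`.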

You probably didn't find the results surprising for that first row, representing a period of play. The most common outcome, happening about 40% of the time, is that the two teams remain tied, most often at 0. The team that shoots better due to getting higher-quality shots, hitting the corners better and so on is only slightly more likely to be the one that is ahead if you know that one of them is. Less than 32% of the time will the better team find themselves ahead after a period in which both get the league average 8 shots, whereas they'll be behind almost 29% of the time.
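With only 8 shots per team you don't need a simulation at all; the one-period case can be worked out exactly from the binomial distribution. The sketch below assumes each shot is an independent coin flip at the team's true percentage, and `period_outcomes` is just a name chosen for this example.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k goals on n independent shots at rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def period_outcomes(n_shots=8, p_a=0.0842, p_b=0.0778):
    """Exact chances that A leads, B leads, or the teams are tied."""
    a_ahead = b_ahead = tied = 0.0
    for a in range(n_shots + 1):
        for b in range(n_shots + 1):
            prob = binom_pmf(a, n_shots, p_a) * binom_pmf(b, n_shots, p_b)
            if a > b:
                a_ahead += prob
            elif b > a:
                b_ahead += prob
            else:
                tied += prob
    return a_ahead, b_ahead, tied

a_ahead, b_ahead, tied = period_outcomes()
scoreless = binom_pmf(0, 8, 0.0842) * binom_pmf(0, 8, 0.0778)
print(f"A ahead {a_ahead:.1%}, B ahead {b_ahead:.1%}, tied {tied:.1%}")
print(f"0-0 after one period: {scoreless:.1%}")
```

The exact figures should sit within a percentage point or so of the simulated first row, with the gap down to the randomness in 10,000 runs.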

Lower on the chart it gets more troubling, especially for us bloggers. The most common sample point for analysis is half a season. Generally the best way to study the persistence of something is to split the season in half, typically first half vs second half or even- and odd-numbered games, and compare the two samples. This works well because the teams should be the same or very similar. If you study something over multiple seasons you aren't getting the same teams every year due to player and coaching changes. In half a season, the team near the top in shooting skill has only about a 2 in 3 chance of outscoring the team near the bottom with the same number of shots. There is also little chance, roughly 13%, of finding that the better team is significantly better at shooting if you were looking at the data. Even over a whole season of shooting data, there is better than a 1 in 5 chance that the worse team will get better results. It isn't until we get several years' worth of shooting results that it tilts heavily in favor of the better shooting team, and even that isn't realistic: teams change so much each offseason, while the simulations assumed the same percentage every season.

As you can see, luck plays a huge role for all reasonable sample sizes. This is the fundamental reason why shot counts are better than goals as a measuring stick. Luck is less of a factor for number of shots taken than number of shots made, so shot totals are more reliable indicators of skill over samples of a season or less. If over a season there is better than a 1 in 5 chance that a good-shooting team scores fewer goals than a bad-shooting team on the same number of shots, then it's tough to say that a team's results are due to skill and not just random luck.

In a future installment I will look at how persistence is affected by sample size.
