Goaltending—Game Theory, the Contrarian Position, and the Possibility of the Extreme

Preamble: The following is a paper I wrote while in college about 6 years ago. It takes a slightly different approach, with worse logic, than I employ now, likely reflecting my attitude at the time: a collegiate goaltender with the illusion of control (hence goals were likely unpredictable events, else I would have stopped them). I have softened on this thinking, but still think the recommendation holds: goaltenders can outperform the average by mixing strategies and adding an element of unpredictability to their game.

 

How goaltender strategy and understanding randomness in hockey can lend insight into the success of truly elite goaltenders.

Introduction

This paper outlines general strategies and philosophies behind goaltending, focusing on what makes great goaltenders great. Philosophy and goaltending make interesting partners: few athletic positions are so continuously branded with a ‘style.’ Since such subjective labels are the norm for this position, I feel quite comfortable using the terms rather broadly in a philosophical analysis. I will use loose generalisations to formulate a big-picture view of the position: how it has evolved, the type of goaltender that has consistently risen above their peers during this evolution, and why. Using game theory and attempting to clearly label player strategies is, at times, clumsy. Addressing the impact of unquantifiable randomness in hockey does not provide much comfort either. However, the purpose is to encourage further thought on the subject, not to provide a concise, numerical answer. It is a question that deserves more thought at both the professional (evaluation and scouting) and grassroots (development and training) levels. The question: what makes a consistently great goaltender?

Game Theory—The Evolution of Goaltending Strategy

Passive ‘blocking’ tactics have become prevalent among goaltenders at all levels. The approach is simple and statistically successful. There are tradeoffs, as with any strategy: the goaltender forfeits aggressiveness in order to force the shooter to make perfect shots to beat them. This ‘fated’ strategy exposes the goaltender to the extreme: most goals allowed are classified as ‘great plays’ or ‘lucky,’ certainly not the fault of the goaltender. However, there are other considerations. Shooters, no doubt, have adjusted their strategy based on this approach, further compromising the passive approach to goaltending. This means a disproportionate number of shooters will look to make ‘perfect’ shots (high and tight to the post against a blocking goaltender) despite the risk of missing the net entirely.

Historically, goaltenders did not have the luxury of light, protective equipment that is designed specifically to seal off any holes while in a butterfly position. Equipment lacking proper protection and effectiveness required goaltenders to spend the majority of the time on their feet while facing shots.

Player/Goaltender Interactions Then and Now

Game theory applications allow a crude analysis of the evolution of strategies between players and goaltenders. The numbers I use are arbitrary; however, they demonstrate an important strategic shift in goaltending tactics. First, let us assume that players have to decide whether to shoot high or low and always try to shoot for the posts. Simultaneously, goaltenders must choose to block or react.

In the age of primitive equipment, goaltenders were required to stand up most of the time to make saves. From here we can make three assumptions in this ‘game’ or ‘shot’: 1) While blocking, the goaltender’s expected success rate was the same whether the shooter shot high or low. Since the ‘blocking’ tactic was simply standing up and challenging excessively when possible, it would not matter if the player shot high or low; the goaltender was simply covering the middle of the net. 2) While reacting, high shots were easier saves than low shots. Goaltenders generally stood up, which made reaching pucks with the hands easy and reaching pucks with the feet hard. 3) Goaltenders were still better reacting than blocking on low shots, since players will always shoot for the posts.

We can then use the iterated elimination of dominated strategies technique to find a dominant strategy for each player. In this scenario, goaltenders are always more successful, on average, reacting than blocking. Since goaltenders will always react, shooters acknowledge they are generally better off shooting low than high (while this is just a fabricated example, the fact that goaltenders survived without helmets might prove this). Regardless, the point of this exercise is to demonstrate that goaltenders needed to have the ability to react to shots during this time. These strategies and the expected save percentages are displayed in the matrix below (Figure 1). Remember that goaltenders want the strategy with the highest save percentage, while shooters want to find the lowest.
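As a rough sketch of the elimination argument, the snippet below applies iterated elimination of strictly dominated strategies to a small save-percentage table. The four numbers are made up for illustration (they are not the values from Figure 1); they are only chosen to satisfy the three assumptions listed above. The function name `eliminate_dominated` is likewise just an illustrative choice.

```python
# Hypothetical save percentages for the stand-up era (NOT the Figure 1 values).
# Keys are (goaltender move, shooter move). The goaltender wants a high number,
# the shooter wants a low one.
STAND_UP_SAVE_PCT = {
    ("block", "high"): 0.80,  # assumption 1: blocking is the same high or low
    ("block", "low"): 0.80,
    ("react", "high"): 0.90,  # assumption 2: reacting is easier on high shots
    ("react", "low"): 0.85,   # assumption 3: reacting still beats blocking low
}


def eliminate_dominated(save_pct, goalie_moves, shooter_moves):
    """Repeatedly remove strictly dominated moves for both sides."""
    goalie_moves, shooter_moves = list(goalie_moves), list(shooter_moves)
    changed = True
    while changed:
        changed = False
        # A goaltender move is dominated if some other remaining move
        # saves strictly more against every remaining shot type.
        for g in list(goalie_moves):
            if any(
                all(save_pct[(other, s)] > save_pct[(g, s)] for s in shooter_moves)
                for other in goalie_moves
                if other != g
            ):
                goalie_moves.remove(g)
                changed = True
        # A shooter move is dominated if some other remaining shot type
        # is saved strictly less often against every remaining goaltender move.
        for s in list(shooter_moves):
            if any(
                all(save_pct[(g, other)] < save_pct[(g, s)] for g in goalie_moves)
                for other in shooter_moves
                if other != s
            ):
                shooter_moves.remove(s)
                changed = True
    return goalie_moves, shooter_moves


surviving = eliminate_dominated(
    STAND_UP_SAVE_PCT, ["block", "react"], ["high", "low"]
)
print(surviving)
```

With these illustrative numbers, blocking is eliminated first, and once the goaltender is known to react, shooting high is eliminated, leaving react against low.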


However, the game of hockey is not as simple as the pure simultaneous-move game we have set up. Offensive players are not shooting in a vacuum. They are often facing defensive pressure or limited to long-distance shots; both circumstances limit the ability of offensive players to shoot the puck accurately. If the goaltender believes their team will be able to limit the frequency of high shots to less than 50%, then the goaltender’s expected save percentage while blocking is greater than their expected save percentage while reacting.

Advances in equipment then allowed the adoption of a new blocking tactic: the butterfly. By dropping to their knees and flaring out their legs, goaltenders were maximising their blocking surface area, particularly along the ice. Equipment was lighter, bigger, and increasingly conducive to the butterfly style, allowing goaltenders to perform at higher levels. Now the same simultaneous-move game described above began to increasingly favour the goaltender. Not only did the butterfly change the way goaltenders blocked, it changed the way they reacted. Goaltenders now tended to react from a butterfly base, dropping down to their knees at the onset of the shot and reacting as they dropped. The effectiveness of the down game now meant shooters were always better off shooting high. In a pure game theory sense, this would suggest players would always shoot high, so goaltenders should still always react. These strategies and the new payoffs are displayed in Figure 2.
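To make the 50% threshold concrete, here is a minimal sketch of the expected save percentage for each goaltender strategy as a function of how often shooters manage to get the puck up high. The payoffs are hypothetical (they are not the Figure 2 values); they are chosen so that shooting high is always better for the shooter, reacting beats blocking on high shots, and the two strategies break even when half of the shots are high.

```python
# Hypothetical butterfly-era save percentages (NOT the Figure 2 values).
BUTTERFLY_SAVE_PCT = {
    ("block", "high"): 0.80,
    ("block", "low"): 0.96,
    ("react", "high"): 0.86,
    ("react", "low"): 0.90,
}


def expected_save_pct(goalie_move, share_high, table=BUTTERFLY_SAVE_PCT):
    """Expected save percentage when a fraction `share_high` of shots go high."""
    return (
        share_high * table[(goalie_move, "high")]
        + (1 - share_high) * table[(goalie_move, "low")]
    )


def break_even_share_high(table=BUTTERFLY_SAVE_PCT):
    """Share of high shots at which blocking and reacting perform equally."""
    low_gap = table[("block", "low")] - table[("react", "low")]
    high_gap = table[("react", "high")] - table[("block", "high")]
    return low_gap / (low_gap + high_gap)


for share in (0.3, 0.5, 0.7):
    block = expected_save_pct("block", share)
    react = expected_save_pct("react", share)
    print(f"high shots {share:.0%}: block {block:.3f}, react {react:.3f}")
```

Under these assumed numbers, blocking comes out ahead whenever the defence keeps high shots below half of the total, and reacting comes out ahead above that share.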


This suggests that goaltenders with a good defence, good blocking technique, and modern goaltending equipment are better off blocking. When a goaltender is said to be ‘playing the percentages,’ this suggests the goaltender routinely blocks the majority of the net and forces the shooter to make a perfect shot. This strategy has raised the average performance of goaltenders. However, in a zero-sum game such as hockey, simply maintaining a level of adequate performance will not increase the goaltender’s absolute success, measured in wins and losses. The only way for a goaltender to positively impact their team is to exceed the average, which—as we will see—can be accomplished by defying the norm.

In conclusion, these strategic interactions did not create hard rules for goaltenders or shooters. However, the permeation of advanced tactics has heavily skewed the payoffs toward the goaltender. Goaltenders block more, and shooters shoot high as much as possible. An unspoken equilibrium has been created and maintained at all levels of hockey—thus altering the instinctive strategies employed by both groups.

The ‘Average’ Position

Goaltenders could now simplify their approach to their position, while simultaneously out-performing their historical predecessors. The average NHL save percentage rose from 87.6% in 1982 to 91.6% in 2011.* This rise in success rate would give any goaltender little incentive to break the norm. Imagine an ‘average’ goaltender, posting a save percentage equivalent to the NHL average save percentage each year. The ‘average’ goaltender would put up better numbers each successive year. While they would be perceived to be more valuable (higher personal statistics mean a bigger contract, more starts, and a greater reputation), it is entirely conceivable that, despite their statistical improvement, they would not contribute to any more victories. If the goaltender at the other end of the ice is performing just as well (on average, of course), then the ‘average’ goaltender will not contribute any extra wins to their team compared to the year before. However, this effect would be difficult to observe over the course of a goaltender’s career, and coaches and managers would become enamoured with ‘average’ goaltending, comparing it favourably to the recent past. The ‘success of mediocrity’ encouraged a simplified, safe, and ‘high-percentage’ approach to the position. If you looked like other goaltenders, played like other goaltenders, and performed like other goaltenders, there was little reason to worry about job security. In short, through the evolution of goaltending, goaltenders generally have had very little to gain from breaking the idyllic norm of how a goaltender should look or play. The implicit equilibrium between shooters and goaltenders has persisted across different eras, most recently centring around a ‘big butterfly, blocking’ game, resulting in historically superior statistics for the ‘average’ goaltender.

The Limits of Success

There is no doubt that the craft of goaltending is now significantly superior to the efforts that preceded it. Goaltenders today are bigger, faster, more athletic, and more advanced technically. However, the quest to fulfil the requirement of ‘average’ will be an empty pursuit in absolute terms (wins and losses) for any goaltender. In order to avoid becoming ‘average’ the goaltender must deviate from the strategic equilibrium that primarily consists of large goaltenders simply ‘playing the percentages.’ While goaltenders can exceed the average by simply being even bigger, faster, and more athletic than their peers, this is becoming increasingly difficult. Not only will teams continue to draft goalies for these attributes, there are natural limits to how tall, fast, and coordinated a human being can be. Shooters will also continue to adjust. An extra 2” in height does not necessarily prevent a perfectly placed shot over or under the glove. Recall the oversimplified simultaneous-move game: shooters will always be better off shooting high and to the posts, when they have time. High-level shooters have evolved to target very specific areas of the net, preying on the predictability of the modern butterfly goalie. However, the shooter will not always have time to attempt the perfect shot, which means the goaltender can revert to primarily blocking and mediocrity without being exposed.

 

 

The Contrarian Position

While goaltenders cannot change their physiology in order to exceed the average, they can (slowly) alter their approach to the game. Remember, the strategic interaction between the goaltender and shooter has become predictable. The goaltender will fill up as much net as possible, forcing the shooter to manufacture a perfect shot, while the shooter will attempt to comply. If a goaltender were to begin to mix strategies effectively and react some percentage of the time, they would be better off. The shooter has been trained to shoot high (that is their dominant strategy), and goaltenders are better off reacting to high shots than blocking and leaving their arms pinned to their sides. Essentially, by mixing strategies when it is wise (when the simple block-react simultaneous-move model applies), the goaltender can increase their expected save percentage and exceed the average.

To demonstrate this point we must move away from the abstract and the general, focusing on specific examples. A disproportionate amount of statistical success throughout the ‘butterfly’ era has been the work of unorthodox goaltenders. While an ‘unorthodox’ style has had a negative connotation in the conventional world of goaltending, it is the defectors that have broken through the limits reached by the big, butterfly goaltender. Sub-six-foot Tim Thomas recently broke the modern NHL save percentage record by willing himself to saves and largely defying the established goaltending practice. The save percentage record previously belonged to Dominik Hasek. Like Thomas, Hasek was less than six feet tall and would consistently move toward the puck like no other goaltender in the game. For shooters that have very clear, habitual objectives (shoot high glove or low blocker just over the pad or through his legs if he is sliding, etc.), facing these contrarians led to a historically low shooter success rate. These athletes effectively mixed their strategies between blocking and reacting (their own versions of these strategies, mind you) to keep shooters guessing. Their contrarian approach has been remarkably sustainable as well: Hasek and Thomas have combined to win 8 out of the last 17 Vezina Trophies, despite their NHL careers only overlapping 3 years. By moving further away from the archetypal goaltender, both Thomas and Hasek exceeded the average considerably. It is exceeding the average that causes goaltenders to contribute to victories, the absolute measurement of success for any goaltender.

Consider the correlation between a unique approach and sustained success when assessing the careers of four Calder Trophy winning goaltenders: Ed Belfour, Martin Brodeur, Andrew Raycroft, and Steve Mason. Each began their NHL career in impressive fashion; however, two went on to become generational goaltenders, while the other two will struggle to equal their initial success. This may seem like an unfair comparison, but it is important to understand why it is unfair. Both Brodeur and Belfour maintained an elite level of play because they generally defied convention throughout their careers. Both played unique styles and were excellent puck handlers. When Belfour entered the league at the very start of the 1990s, his combination of athleticism, intensity, and an advanced understanding of positional play made him formidable. He mastered the butterfly before it was the standard; you could argue the success of Patrick Roy and Belfour helped create the current generation of ‘big, butterfly’ goaltenders. Brodeur has always been different: there has been no comparable goaltender to him throughout his career, just like Thomas or Hasek. He has been the most consistent and celebrated goaltender in NHL history without utilising the most common save tactic employed by his peers: he rarely drops into a true butterfly. Counter-intuitively, despite lacking a standard, universal save movement, he has also been remarkably consistent. Martin Brodeur has mixed his save selection strategies magnificently, preying on shooters programmed to shoot against predictable butterfly practitioners.

Now consider the other rookie standouts: Raycroft and Mason. It is difficult to distinguish their approach to the game from the approach of other ‘average’ professionals. Mason is taller than average and catches right, but he does not present a unique challenge to shooters. They are goaltenders with an average, ‘percentage-based’ approach to goaltending. There is nothing noteworthy about the way they play the position. Why the initial success? Both goaltenders likely overachieved (positive deviation from the average) due to a favourable situation and the vague element of surprise. Shooters would soon adjust to the subtleties in the young goaltenders’ games.* Personal weaknesses would become exploited and their performance regressed towards the mean. Their rookie years could have been duplicated by a number of other rookie goaltenders with similar skill and luck. Their ‘average’ size, skill set, and approach to the game have manifested themselves in ‘average’ NHL careers. An impressive beginning was nothing more than favourable luck and circumstance, and their careers diverged significantly from those of other Calder-winning goaltenders. Goaltenders that went throughout their careers masterfully mixing save selection strategies, by contrast, set the standard for consistency, longevity, and performance.

In conclusion, the modern equilibrium between goaltenders and shooters has been successfully disrupted by contrarians like Dominik Hasek, Tim Thomas, and Martin Brodeur. The rest have enjoyed the benefits of the ‘big, butterfly goaltender’ doctrine (stopping more pucks on average) but have gained little ground on other ‘average’ goaltenders. These goaltenders are playing a strategy that contributes little to their team because they are more susceptible to the extreme.

 

The Possibility of the Extreme—The Black Swan Save 

If contrarians exceed the average, it is important to understand how they can do it with remarkable consistency. I believe their unconventional style and willingness to react to shots leaves them better prepared to handle the possibility of the statistically unique shot—which I will call a ‘Black Swan’ opportunity.§ They can always use the butterfly tactic in situations that call for it, while the butterfly-reliant goaltenders struggle to improvise like contrarians. The ‘reaction’ strategy leaves them free to make the unconventional saves necessary to prevent Black Swans from becoming goals.

The position relies on instinct and split-second decisions. Reactions and responses to defined situations are drilled into goalies from an increasingly young age. Long before these goaltenders are capable of playing in the NHL, they have generally mastered technical responses to certain, finite situations. Goaltenders may be trained very well to react predictably in trained circumstances, but this leaves the goaltender susceptible to the extreme, breeding mediocrity. In this case, the extreme or Black Swan shot is the result of 10 position players on the ice, moving at speeds up to 30 miles per hour, chasing an object that can move close to 100 miles per hour. Despite the simple objective and the definitive results of the goaltending position, every shot against them has the potential to create an infinite number of complexities and permutations. A one-dimensional approach to the position, where the goaltender determines they are better off ‘playing the percentages,’ offers the goaltender the opportunity to make a large number of saves, but it does not prepare the goaltender to react favourably to a Black Swan. The problem, then, is not maintaining a predictable level of performance (making the saves ‘you should make’); it is the ability to adjust to the unpredictable and the extreme in order to make a critical save. This is accomplished by reacting to shots a healthy percentage of the time.

The real objective of the goaltender is to give up fewer goals than the opposing goaltender. In a low-scoring game such as hockey, it is likely one goal against will determine the outcome of any given game. Passively leaving the outcome up to chance is a mistake in my opinion. Aggressiveness and assertiveness are competitive qualities that are compromised by a predominantly butterfly style. By dropping into the butterfly the goaltender is surrendering to whatever unlikely or unlucky shot may occur. A great play, a seeing-eye shot, or an unlikely bounce (the ‘unlikely, undrilled’ occurrences that have the potential to win or lose games) happen randomly. The goaltender must be aggressive and decisive in order to adjust to these situations. These are the shots that cannot be replicated in repetitive drills; they require the creativity and instinctive reaction of a contrarian.

Goaltending—A Lesson in Randomness

The frequency of the Black Swan shot or goal against is erratic. They can happen at any time. There is little correlation between shots against and goals against on a game-by-game basis. If we assume the number of Black Swans a goaltender faces is roughly proportional to the number of goals given up* (generally, the more improbable shots faced, the more goals against), we counter-intuitively observe that the Black Swans and the goals they cause occur randomly in a hockey game, largely independent of the number of shots against the goaltender. Taking the 10 busiest goaltenders of the 2010-2011 season, we see that their save percentage generally goes up as they receive more shots against. It does not matter whether the team gives up 20 shots or 40 shots; the random Black Swan occurrences that result in goals will happen just as frequently, regardless of the shots against. In outings where those goaltenders faced more than 40 shots, the average save percentage and shots against were 94.63% and 43.51, respectively. This implies these goaltenders gave up, on average, 2.33 goals per game when facing more than 40 shots. When these same goaltenders faced fewer than 20 shots, their save percentage was a paltry 82.17% on an average of 14.85 shots. This implies 2.64 goals against per outing where the goaltender faced fewer than 20 shots.§ Counter-intuitively, they fared worse while facing less than half as many shots.
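The implied goals-against figures follow directly from the averages quoted above: goals against is roughly shots faced times one minus the save percentage. The short check below uses only the rounded numbers given in the text, so it lands within about 0.01 of the quoted 2.33 and 2.64 rather than matching them exactly (the original figures were presumably computed from unrounded data). The function name is an illustrative choice.

```python
def implied_goals_against(avg_shots, save_pct):
    """Average goals against implied by average shots faced and save percentage."""
    return avg_shots * (1 - save_pct)


# Outings with more than 40 shots: 43.51 shots at a 94.63% save percentage.
heavy = implied_goals_against(43.51, 0.9463)

# Outings with fewer than 20 shots: 14.85 shots at an 82.17% save percentage.
light = implied_goals_against(14.85, 0.8217)

print(f"more than 40 shots: {heavy:.2f} goals against per outing")
print(f"fewer than 20 shots: {light:.2f} goals against per outing")
```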

The frequency of the ‘Black Swan’ occurrences that led to goals appears to be largely independent of shots on goal. ‘Playing the percentages’ leaves every goaltender hopelessly exposed to random chance throughout the game. Goaltenders in the world’s best league do no better in absolute terms when they face 20 shots than when they face 40 shots. They are the same goaltenders; they just fall victim to circumstance and luck.

Simply ‘playing the percentages,’ with an emphasis on blocking from the butterfly, leaves the goaltender’s fate up to pure chance. No goaltender can attempt to consistently out-perform their peers by playing the percentages, at least not with certainty. Hoping to block 90% of the net while relying on your team to limit quality opportunities will result in mediocrity. The Black Swan events that lead to goals occur randomly and just as frequently facing 15 shots as 50 shots. This has manifested itself in ‘average’ goaltenders’ performances fluctuating unpredictably from game to game and from season to season. In a game where random luck is prevalent, employing a strategy that struggles to adjust to the complexities of a game as dynamic as hockey will result in erratic and unexplainable outcomes.

The Challenge to the Contrarian

This creates a counter-intuitive result: the prototypical, ‘by the book’ goaltender will likely be subjected to greater fluctuations in performance, despite having the technical mastery of the position that suggests a level of control. Instead, it is the contrarian, with no attachment to the ‘proper’ way to make the save, that will achieve more consistent results. The improvisational nature of a Tim Thomas stick save may appear out of control, but his approach to the game will yield more consistent results. The aggressiveness and assertiveness will allow the contrarian to make saves when there is no technical road map to reach the proper position on a Black Swan shot. Consider the attributes necessary to make an incredible save. Physical attributes vary among NHL goaltenders, but not by much. Height, agility, reflexes, and other critical skills for any professional goaltender will cluster around a certain standard. On the other hand, the mental approach to the game can vary between goaltenders by orders of magnitude. Goaltenders can become robust against the effects of Black Swans by having the creativity to reach pucks ‘technicians’ could not and having the courage to abandon the perceived safety of the butterfly. Decreasing the effects of Black Swans would be huge, and there are no theoretical limits (unlike physical limits) on doing so. In a game containing the possibility of the extreme, it is the contrarian goaltender that will best be able to prevent goals against.

Leaving the safety of the ‘butterfly style’ can be dangerous for a goaltender. Coaches, managers, analysts, and peers will be quick to realise when a goal could have been stopped by a goaltender passively waiting in their butterfly. These ‘evaluators’ and ‘experts’ have subscribed to the ‘average’ goaltender paradigm for over a decade. After Game 5 of the 2011 Stanley Cup Final, Roberto Luongo suggested that the only goal of the game against Tim Thomas would have been “an easy save for (him).” Proactively mixing save strategies does leave the contrarian potentially exposed to the unconventional goal against. Improbable, unconventional saves are great, but coaches and managers really only care about goals against. They can handle them if the goal was not the fault of the goalie: the perfect shot or improbable bounce that preys upon the passive butterfly goaltender. Just don’t pass up the opportunity to make an easy save and get scored on, contend the experts (luckily, Thomas was able to put together the greatest season of any goaltender in the modern game, so he got a pass). Playing the game freed from the ‘butterfly-first’ doctrine is a leap of faith, but it gives the goaltender the opportunity to contribute something positive to their team: wins.

Consider the great Martin Brodeur: the winningest goaltender in NHL history has often been discredited for playing behind strong defensive clubs while winning games and championships. However, random Black Swan chances have little regard for the number of shots against, as we have seen. So why does Martin Brodeur have the most victories of any goaltender in NHL history? I would give a large amount of credit to his ability to make the ‘key save’ on the unlikely chance against. These saves would not necessarily manifest themselves noticeably at the end of the game or in any statistically significant way; rather, they are randomly distributed throughout the game, like Black Swans are. Remember that, while New Jersey has been traditionally strong defensively, they have averaged 16th in the league in scoring during Brodeur’s tenure. With this inconsistent (and at times lethargic) goal support, Brodeur’s win totals remained remarkably consistent. During his prime he recorded at least 37 victories in 11 consecutive seasons. The low-scoring years required extreme focus and competency. Where the game could hinge on one great play or bad bounce, Brodeur preserved victory more than any contemporary by being vigilant against the Black Swan chances. You can make the argument that the low shot totals (and the subsequent merely ‘good’ save percentage) led to him being overrated considering his absolute success. However, Black Swans are somewhat independent of shots against, and until his detractors understand how three ‘Brodeur-only saves’ were the difference in a 3-2 win in a game where New Jersey gave up only 23 shots, the winningest goaltender of all time will continue to be regrettably underrated, except for where it counts. No statistical analysis can measure the increased importance of a save to preserve victory compared to a save without that pressure.

Conclusion

I felt it was important to actively think about the strategies that have permeated the goaltending position and the impact they have had on goaltending performance. It was also important to liberate my thinking from too much quantitative analysis, focusing instead on the qualitative relationships between goaltender strategy, the random nature of the position, the goaltenders that consistently exceed the norm, and the goaltenders that will always be products of circumstance. None of this could be done with traditional goaltender metrics; they do not even begin to consider the possibility of the Black Swan opportunity against. Traditional statistics can be manipulated to underrate the winningest goaltender of all time. Winning is sport’s sole objective and the goaltender always has some influence on winning, so goaltender wins are important. Traditional statistics lead to complacency with ‘average’ goaltending, which is goaltending that adds nothing to the bottom line: winning. Leaving these statistical constraints behind can help clarify the connection between strategy and the contrarian, then between the contrarian and success.

Based on this philosophical analysis, I believe goaltenders should unsubscribe from the conventional goaltending handbook and aggressively mix their save selection, helping them remain robust against the inevitable Black Swan opportunities against them. This will allow them to exceed the ‘expected’ performance, and ultimately win more games.

____________________________________________

* A 4-percentage-point increase in save percentage is significant; this is analogous to saying goaltenders gave up 48% more goals on the same number of shots in 1982 than in 2011.
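The 48% figure can be checked from the two save percentages alone, since goals allowed per shot is one minus the save percentage. A minimal check follows; the function name is only illustrative.

```python
def extra_goals_share(old_save_pct, new_save_pct):
    """Fractional increase in goals allowed on the same number of shots."""
    old_goal_rate = 1 - old_save_pct  # goals per shot in the earlier season
    new_goal_rate = 1 - new_save_pct  # goals per shot in the later season
    return old_goal_rate / new_goal_rate - 1


# 1982 (87.6%) compared with 2011 (91.6%), as quoted in the text.
increase = extra_goals_share(0.876, 0.916)
print(f"{increase:.1%} more goals on the same shots")
```

The ratio is 12.4 goals per 100 shots against 8.4, which is the roughly 48% difference described in the footnote.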

* While the butterfly style may be generic, each goaltender has relative strengths and weaknesses. NHL shooters will eventually expose these weaknesses unless the goaltenders can successfully vary their strategy (remain unpredictable).

In the ‘modern’ game-theory example, the goaltender would have to react the vast majority of the time to force the shooter to mix between shooting high and low (which is ideal for the goaltender). By doing so, goaltenders can exert their influence on the shooter, as opposed to simply accepting that a great shot or lucky bounce will beat them.

  • A term borrowed from Nassim Nicholas Taleb and his book The Black Swan: The Impact of the Highly Improbable. Black Swans, named after the rare bird, represent the improbable and random occurrences in hockey and in life. Just because we cannot conceive of a particular challenge and have not prepared for it does not mean it will not happen. ‘Black Swans’ are unpredictable, can have a large impact (a goal), and are the result of an ecosystem that is far too complex to predict (10 players, a puck, and physics create infinite possibilities). Events are weakly explained after the fact (you held your glove too high) but in reality the causes are much deeper and impossible to predict.

* While I would argue some goaltenders are better equipped to handle ‘Black Swan’ opportunities against them, these difficult, unforeseen events will still be approximately proportionate to the number of goals they give up. NB: Tim Thomas is not included in this list.

This ‘extreme’ case happened 47 times out of the 677 games collectively played.

  • Many of these games saw the goaltender pulled, so the goals against is ‘per appearance’ rather than ‘per game.’ While it may be argued that these goaltenders just ‘didn’t have it’ in these games, I would argue that more often they faced a cluster of bad luck and improbable chances against them. The total sample size is 60 games.

This attitude may explain the regression in Luongo’s game over the last couple of seasons. He once was a 6’3 goaltender with freakishly long limbs that would reach pucks in unconventional and spectacular ways. Now he views himself as a pure positional goaltender that is better off on the goal line than aggressively attacking shots against him. Apparently it is better to look ‘good’ getting scored on multiple times than look ‘bad’ getting scored on once.

The standard deviation is 10 places, basically all over the place, ranging from leading the league in goals for to finishing last in goals for.

Hockey Analytics, Strategy, & Game Theory

Strategic Snapshot: Isolating QREAM

I’ve recently attempted to measure goaltending performance by looking at the number of expected goals a goaltender faces compared to the goals they actually allow. Expected goals are ‘probabilistic goals’ based on what we have data for (which isn’t everything): if that shot were taken 1,000 times on the average goalie that made the NHL, how often would it be a goal? Looking at one shot there is variance (the puck either goes in or doesn’t), but over the course of a season, summing the expected goals gives a better idea of how the goaltender is performing because we can adjust for the quality of shots they face, helping isolate their ‘skill’ in making saves. The metric, which I’ll refer to as QREAM (Quality Rules Everything Around Me), reflects goaltender puck-saving skill more than raw save percentage, showing more stability within a goalie’s season.
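The comparison described above can be sketched in a few lines. This is not the actual QREAM calculation (the post does not spell out its formula); it is a minimal illustration of summing per-shot goal probabilities and comparing the total with goals actually allowed. The `Shot` class, its field names, and the sample probabilities are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    goal_probability: float  # assumed chance an average NHL goalie allows this shot
    was_goal: bool           # what actually happened


def goals_saved_above_expected(shots):
    """Expected goals minus actual goals; positive means better than average."""
    expected_goals = sum(shot.goal_probability for shot in shots)
    actual_goals = sum(shot.was_goal for shot in shots)
    return expected_goals - actual_goals


# Four made-up shots: a point shot, a slot chance that went in,
# a sharp-angle wrister, and a rebound that was stopped.
sample = [
    Shot(0.05, False),
    Shot(0.30, True),
    Shot(0.10, False),
    Shot(0.25, False),
]

print(round(goals_saved_above_expected(sample), 2))
```

On a handful of shots the result is dominated by variance, which is the point made above: the number only starts to say something about skill once it is summed over a season.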

Goalies doing the splits

Good stuff. We can then use QREAM to break down goalie performance by situations, tactical or circumstantial, to reveal actionable trends. Is goalie A better on shots from the left side or right side? Left shooters or right shooters? Wrist shots, deflections, etc? Powerplay? Powerplay, left or right side? etc. We can even visualise it, and create a unique descriptive look at how each goaltender or team performed.

This is a great start. The next step in confirming the validity of a statistic is looking at how it holds up over time. Is goalie B consistently weak on powerplay shots from the left side? Is it something that can be exploited by looking at the data? Predictivity is important to validate a metric, showing that it can be acted upon and some sort of result can be expected. Unfortunately, year-over-year trends by goalie don’t hold up in an actionable way. There might be a few persistent trends below, but nothing systemic that would be more prevalent than just luck. Why?

Game Theory (time for some)

In the QREAM example, predictivity is elusive because hockey is not static and all players and coaches in question are optimizers trying their best to generate or prevent goals at any time. Both teams are constantly making adjustments, sometimes strategically and sometimes unconsciously. As a data scientist, when I analyse 750,000 shots over 10 seasons, I only see what happened, not what didn’t happen. If in one season goalie A underperformed the average on shots from left shooters from the left side of the ice, that would show up in the data, but it would be noticed by players and coaches quicker and in a much more meaningful and actionable way (maybe it was the result of hand placement, lack of squareness, cheating to the middle, defenders who let up cross-ice passes from right to left more often than expected, etc.). The goalie and defensive team would also pick up on these trends and understandably compensate, maybe even slightly over-compensate, which would open up other options for attempting to score, which the goalie would adjust to, and so on until the game reaches some sort of multi-dimensional equilibrium (actual game theory). If a systemic trend did continue then there’s a good chance that the goalie would be out of the league. Either way, trying to capture a meaningful actionable insight from the analysis is much like trying to capture lightning in a bottle. In both cases, finding a reliable pattern in a game where both sides are constantly adjusting and counter-adjusting is very difficult.

This isn’t to say the analysis can’t be improved. My expected goal model has weaknesses and will always have limitations due to data and user error. That said, I would expect the insights of even a perfect model to be arbitraged away. More shockingly (since I haven’t looked at this in-depth, at all), I would expect the recent trend of NBA teams fading the use of mid-range shots to reverse in time as more teams counter that with personnel and tactics. A smart team could then probably exploit that set-up by employing slightly more mid-range shots, and so on, until a new equilibrium is reached. See you all at Sloan 2020.

Data On Ice

The role of analytics is to provide a new lens to look at problems and make better-informed decisions. There are plenty of examples of applications at the hockey management level to support this: data analytics have aided draft strategy and roster composition. But bringing advanced analytics to on-ice strategy will likely continue to chase adjustments players and coaches are constantly making already. Even macro-analysis can be difficult once the underlying inputs are considered.
An analyst might look at strategies to enter the offensive zone, where you can either forfeit control (dump it in) or attempt to maintain control (carry or pass it in). If you watched a sizable sample of games across all teams and a few different seasons, you would probably find that you were more likely to score a goal if you tried to pass or carry the puck into the offensive zone than if you dumped it. Actionable insight! However, none of these plays occur in a vacuum – a true A/B test would have the offensive players randomise between dumping it in and carrying it. But the offensive player doesn’t randomise; they are making what they believe to be the right play at that time considering things like offensive support, defensive pressure, and their own and their teammates’ shift length. In general, when they dump the puck, they are probably trying to make a poor position slightly less bad and get off the ice. A randomised attempted carry-in might be stopped and result in a transition play against. So, the insight of not dumping the puck should be changed to ‘have the 5-player unit be in a position to carry the puck into the offensive zone,’ which encompasses more than a dump/carry strategy. In that case, this isn’t really an actionable, data-driven strategy, rather an observation. A player who dumps the puck more often likely does so because they struggle to generate speed and possession from the defensive zone, something that would probably be reflected in other macro-stats (e.g. the share of shots or goals they are on the ice for). The real insight is the player probably has some deficiencies in their game. And this is where the underlying complexity of hockey begins to grate at macro-measures of hockey analysis: there are many little games within the game, player-level optimisation, and second-order effects that make capturing true actionable, data-driven insight difficult.[1]
It can be done, though in a round-about way. Like many, I support the idea of using (more specifically, testing) 4 or even 5 forwards on the powerplay. However, it’s important to remember that analysis showing a 4F powerplay is more effective is more of a representation of the personnel on teams that elect to use that strategy, rather than the effectiveness of that particular strategy in a vacuum. And teams will work to counter by maximising their chance of getting the puck and attacking the forward on defence by increasing aggressiveness, which may be countered by a second defenseman, and so forth.
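The self-selection problem in the dump-versus-carry example can be illustrated with a small simulation. This is only a sketch under invented assumptions: the `quality` variable (how good the situation is at the blue line), the scoring probabilities in `goal_prob`, and the rule that players carry only when quality is above 0.5 are all hypothetical. The numbers are chosen so that carrying has no average benefit when assigned at random, yet looks clearly better in observed, self-selected data.

```python
import random


def goal_prob(quality, carry):
    """Hypothetical scoring chance for one zone entry.

    Better situations score more either way; carrying helps in good
    situations and hurts in bad ones (turnovers), netting out to zero
    on average across all situations.
    """
    base = 0.01 + 0.04 * quality
    carry_effect = 0.02 * (quality - 0.5) if carry else 0.0
    return base + carry_effect


def carry_minus_dump(n, policy, rng):
    """Observed goal rate on carries minus observed goal rate on dumps."""
    goals = {True: 0, False: 0}
    tries = {True: 0, False: 0}
    for _ in range(n):
        quality = rng.random()
        carry = policy(quality, rng)
        tries[carry] += 1
        if rng.random() < goal_prob(quality, carry):
            goals[carry] += 1
    return goals[True] / tries[True] - goals[False] / tries[False]


def self_selected(quality, rng):
    # Players carry only when the situation is favourable.
    return quality > 0.5


def randomised(quality, rng):
    # The true A/B test: a coin flip regardless of the situation.
    return rng.random() < 0.5


rng = random.Random(0)
naive_gap = carry_minus_dump(200_000, self_selected, rng)
ab_gap = carry_minus_dump(200_000, randomised, rng)
print(f"observed (self-selected) carry advantage: {naive_gap:.4f}")
print(f"randomised carry advantage:               {ab_gap:.4f}")
```

In this toy world the observed gap comes almost entirely from the situations in which players choose to carry, which is the same reason a self-selected 4F powerplay can look better than it would in a vacuum.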

Game Theory (revisited & evolved)

Where analytics looks to build strategic insights on a foundation of shifting sand, there’s an equally interesting force at work – evolutionary game theory. Let’s go back to the example of the number of forwards employed on the powerplay: teams can use 3, 4, or 5 forwards. In game theory, we look for a dominant strategy first. While self-selected 4 forward powerplays are more effective, a team shouldn’t necessarily employ one if up by 2 goals in the 3rd period, since a marginal goal for is worth less than a marginal goal against. And because 4 forward powerplays, intuitively, are more likely to concede chances and goals against than 3F-2D, it’s not a dominant strategy. Neither are 3F-2D or 5F-0D.
Thought experiment. Imagine in the first season, every team employed 3F-2D. In season 2, one team employs a 4F-1D powerplay 70% of the time. They would have some marginal success because the rest of the league is configured to oppose 3F-2D, and in season 3 this strategy replicates: more teams run a 4F-1D, in line with evolutionary game theory. Eventually, say in season 10, more teams might run a 4F-1D powerplay than 3F-2D, and some even 5F-0D. However, penalty kills will also adjust to counter-balance and the game will continue. There may or may not be an evolutionarily stable strategy where teams are best served mixing strategies like you would playing rock-paper-scissors.[2] I imagine the proper strategy would depend on score state (primarily), and respective personnel.
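The thought experiment can be sketched as a discrete replicator update, a standard tool in evolutionary game theory: each season, a strategy’s share of the league grows in proportion to how its payoff compares with the league average. Everything numeric here is hypothetical. The payoff functions assume the 4F-1D edge shrinks as penalty kills see it more often, and the sketch ignores 5F-0D, score state, and personnel entirely.

```python
def payoff_4f(share_4f):
    # Hypothetical: 4F-1D is strong when rare, weaker once penalty kills adapt.
    return 1.2 - 0.4 * share_4f


def payoff_3f(share_4f):
    # Hypothetical: 3F-2D improves slightly as penalty kills prepare for 4F-1D.
    return 0.9 + 0.1 * share_4f


def next_share(share_4f):
    """One season of replicator dynamics for the share of 4F-1D powerplays."""
    f4 = payoff_4f(share_4f)
    f3 = payoff_3f(share_4f)
    league_average = share_4f * f4 + (1 - share_4f) * f3
    return share_4f * f4 / league_average


def simulate_seasons(initial_share, seasons):
    shares = [initial_share]
    for _ in range(seasons):
        shares.append(next_share(shares[-1]))
    return shares


# Start with one team in thirty running 4F-1D.
shares = simulate_seasons(1 / 30, 100)
for season in (0, 1, 5, 10, 20, 100):
    print(season, round(shares[season], 3))
```

With these made-up payoffs the two strategies earn the same at a 60% share of 4F-1D (1.2 - 0.4x = 0.9 + 0.1x gives x = 0.6), so the league settles on a mix rather than a single dominant configuration.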
You can imagine a similar game representing the function of the first forward in on the forecheck. They can go for the puck or hit the defenseman – always going for the puck would let the defenseman become too comfortable, letting them make more effective plays, while always hitting would take them out of the play too often, conceding too much ice after a simple pass. The optimal strategy is likely randomising, say, hitting 20% of the time factoring in gap, score, personnel, etc.
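The forecheck game can be written as a 2x2 zero-sum game and solved with the standard indifference conditions for a mixed strategy. The payoff numbers below are invented, and deliberately chosen so that the answer lands on the 20% figure used above; the defenseman’s two options (hold the puck to make a play, or move it quickly) are also an assumption made for the example.

```python
from fractions import Fraction


def mixed_equilibrium_2x2(a, b, c, d):
    """Solve a 2x2 zero-sum game with row-player payoffs [[a, b], [c, d]].

    Returns (p, q, value): p is the probability the row player picks the
    first row, q the probability the column player picks the first column,
    each chosen to leave the opponent indifferent between their options.
    """
    denom = a - b - c + d
    if denom == 0:
        raise ValueError("degenerate game: no unique mixed equilibrium")
    p = Fraction(d - c, denom)
    q = Fraction(d - b, denom)
    if not (0 < p < 1 and 0 < q < 1):
        raise ValueError("a pure strategy is optimal; no mixing needed")
    value = Fraction(a * d - b * c, denom)
    return p, q, value


# Hypothetical payoffs to the forechecker.
# Rows: hit, play the puck. Columns: defenseman holds, defenseman moves it quickly.
hit_hold, hit_quick = 3, -1    # hit lands on a comfortable D / forechecker taken out of the play
puck_hold, puck_quick = 0, 1   # contested puck / pressure stays on

p_hit, q_hold, value = mixed_equilibrium_2x2(hit_hold, hit_quick, puck_hold, puck_quick)
print(p_hit, q_hold, value)
```

Changing the payoffs for gap, score, or personnel shifts the mix, which is the sense in which the right frequency depends on context rather than being a fixed number.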

A More Robust (& Strategic) Approach

Even if it seems a purely analytic-driven strategy is difficult to conceive, there is an opportunity to take advantage of this knowledge. Time is a more robust test of on-ice strategies than p-values. Good strategies will survive and replicate, poor ones will (eventually and painfully) die off. Innovative ideas can be sourced from anywhere and employed in minor-pro affiliates where the strategies’ effects can be quantified in a more controlled environment. Each organisation has hundreds of games a year in their control and can observe many more. Understanding that building an analytical case for a strategy may be difficult (coaches are normally sceptical of data, maybe intuitively for the reasons above), analysts can sell the merit of experimenting and measuring, giving the coach major ownership of what is tested. After all, it pays to be first in a dynamic game such as hockey. Bobby Orr changed the way blueliners played. New blocking tactics (and equipment) led to improved goaltending. Hall-of-Fame forward Sergei Fedorov was a terrific defenseman on some of the best teams of the modern era.[3] Teams will benefit from being the first to employ (good) strategies that other teams don’t see consistently and don’t devote considerable time preparing for.
The game can also improve using this framework. If leagues want to encourage goal scoring, they should encourage new tactics by incentivising goals. I would argue that the best and most sustainable way to increase goal scoring would be to award AHL teams 3 points for scoring 5 goals in a win. This would encourage offensive innovation and heuristics that would eventually filter up to the NHL level. Smaller equipment or bigger nets are susceptible to second-order effects. For example, good teams may slow down the game when leading (since the value of a marginal goal for is now worth less than a marginal goal against), making the on-ice product even less exciting. Incentives and innovation work better than micro-managing.

In Sum

The primary role of analytics in sport and business is to deliver actionable insights using the tools at their disposal, whether it is statistics, math, logic, or whatever. With current data, it is easier for analysts to observe results than to formulate superior on-ice strategies. Instead of struggling to capture the effect of strategy in biased data, they should use this to their advantage and look at these opportunities through the prism of game theory: test, measure, and let the best strategies bubble to the top. Even the best analysis might fail to pick up on some second-order effect, but thousands of shifts are less likely to be fooled. The data is too limited in many ways to paint the complete picture. A great analogy came from football (soccer) analyst Marek Kwiatkowski:

Almost the entire conceptual arsenal that we use today to describe and study football consists of on-the-ball event types, that is to say it maps directly to raw data. We speak of “tackles” and “aerial duels” and “big chances” without pausing to consider whether they are the appropriate unit of analysis. I believe that they are not. That is not to say that the events are not real; but they are merely side effects of a complex and fluid process that is football, and in isolation carry little information about its true nature. To focus on them then is to watch the train passing by looking at the sparks it sets off on the rails.

Hopefully, there will soon be a time where every event is recorded, and in-depth analysis can capture everything necessary to isolate things like specific goalie weaknesses, optimal powerplay strategy, or best practices on the forecheck. Until then there are underlying forces at work that will escape detection. But it’s not all bad news: the best strategy is to innovate and measure. This may not be groundbreaking to the many innovative hockey coaches out there, but it can help focus the smart analyst on delivering something actionable.

____________________________________________

 

[1] Is hockey a simple or complex system? When I think about hockey and how to best measure it, this is a troubling question I keep coming back to. A simple system has a modest number of interacting components with clear relationships to one another: say, when you are trailing in a game, you are more likely to out-shoot the other team than you would otherwise. A complex system has a large number of interacting pieces that may combine to make these relationships non-linear and difficult to model or quantify. Say, when you are trailing the pressure you generate will be a function of time left in the game, respective coaching strategies, respective talent gaps, whether the home team is line matching (presumably to their favor), in-game injuries or penalties (permanent or temporary), whether one or both teams are playing on short rest, cumulative impact of physical play against each team, ice conditions, and so on.

Fortunately, statistics are such a powerful tool because a lot of these micro-variables even out over the course of the season, or possibly the game, to become net neutral. Students learning about gravitational force don’t need to worry about molecular forces within an object; the system (e.g. a block sliding on an inclined slope) can be separated from the complex and simplified. By making the right simplifying assumptions we can do the same in hockey, but we do so at the risk of losing important information. More convincingly, we can also attempt to build out the entire state-space (e.g. different combinations of players on the ice) and use machine learning to find patterns between the features and winning hockey games. This is likely being leveraged internally by teams (who can generate additional data) and/or professional gamblers. However, with machine learning techniques applied there appeared to be a theoretical upper bound of single game prediction, only about 62%. The rest, presumably, is luck. Even if this upper-bound softens with more data, such as biometrics and player tracking, prediction in hockey will still be difficult.

It seems to me that hockey is suspended somewhere between the simple and the complex. On the surface, there’s a veneer of simplicity and familiarity, but perhaps there’s much going on underneath the surface that is important but can’t be quantified properly. On a scale from simple to complex, I think hockey is closer to complex than simple, but not as complex as the stock market, for example, where upside and downside are theoretically unlimited and not bound by the rules of a game or a set amount of time. A hockey game may be 60 on a scale of 0 (simple) to 100 (complex).

[2] Spoiler alert: if you perform the same thought experiment with rock-paper-scissors you arrive at the right answer –  randomise between all 3, each 1/3 of the time – unless you are a master of psychology and can read those around you. This obviously has a closed form solution, but I like visuals better:
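A minimal sketch of why the even mix is the answer: for any mix of rock, paper, and scissors, compute the best payoff an opponent can earn against it. Only the 1/3-1/3-1/3 mix leaves the opponent nothing to exploit. The function names and the sample lopsided mix are assumptions made for illustration, and this checks the closed-form answer directly rather than simulating the season-by-season evolution.

```python
from fractions import Fraction

ACTIONS = ("rock", "paper", "scissors")

# WINS[i][j]: payoff to a player choosing ACTIONS[i] against ACTIONS[j]
# (+1 win, -1 loss, 0 tie).
WINS = (
    (0, -1, 1),   # rock loses to paper, beats scissors
    (1, 0, -1),   # paper beats rock, loses to scissors
    (-1, 1, 0),   # scissors loses to rock, beats paper
)


def best_response(mix):
    """Best pure reply to a mixed strategy and the expected payoff it earns."""
    payoffs = [
        sum(WINS[i][j] * mix[j] for j in range(3))
        for i in range(3)
    ]
    best = max(range(3), key=lambda i: payoffs[i])
    return ACTIONS[best], payoffs[best]


third = Fraction(1, 3)
uniform = (third, third, third)
rock_heavy = (Fraction(1, 2), Fraction(1, 4), Fraction(1, 4))

print(best_response(uniform))     # nothing to gain against the even mix
print(best_response(rock_heavy))  # paper profits against a rock-heavy player
```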

[3] This likely speaks more to personnel than tactics; Fedorov could have been peerless. However, I look to football where position changes are more common, e.g. a forgettable college receiver at Stanford switched to defence halfway through his college career and became a top player in the NFL: Richard Sherman. Julian Edelman was a college quarterback and is now a top receiver on the Super Bowl champions. Test and measure.