As my first full year as a part of the hockey analytics community wraps up, I was given the opportunity to review Rob Vollman's latest installment in the Hockey Abstract series. In this fourth installment, rebranded as Stat Shot, Vollman and colleagues Iain Fyffe and Tom Awad tackle several important questions, including how to properly construct a hockey team and how to project a junior player's statistics. What follows is my review of their work.
As a relative newcomer to the hockey analytics world, I found myself in a unique position to review their work. For me, the most important pieces of this book are the in-text references. Vollman and colleagues do a superb job of referencing the work of others to build a historical timeline of the work done on specific analytics questions. This historical context allows the reader to understand where we are and how we got there. An excellent example is the discussion of whether the number of shots a goaltender sees affects their ability to stop the puck. Vollman and colleagues reference more than 10 independent studies that ultimately lead to the conclusion we hold today. For those interested in learning more about hockey analytics, a thorough understanding of the work that has been done is imperative. Stat Shot provides that historical context and would be a great reference for anyone looking to learn more about the field.
Another wonderful feature of Stat Shot is one that is tough to explain. For fans less familiar with hockey analytics, save percentage represents the percentage of shots a goaltender stops. When we talk about the leaders in save percentage, the conversation usually begins and ends with the raw number reported. Ben Bishop is good because his save percentage was .926. Steve Mason is only average because his save percentage was .918. In Stat Shot, Vollman and colleagues do a superb job of engaging readers in the types of conversations that need to be had when we discuss statistics. What contextual variables need to be addressed when discussing a metric? Is the metric repeatable? How well does it explain what we are observing? These are questions that Stat Shot consistently addresses at a level that is easy for all readers to understand. Bringing it back to goaltending, what percentage of shots came from dangerous areas? For that matter, how do we define a dangerous shot? How much time did a goaltender spend shorthanded? Challenging yourself to think in that mindset is one of the tougher things to do, especially if you are not accustomed to questioning everything that is presented to you. Stat Shot shows readers how to do that and, more importantly, how to do it well.
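To make the point about context concrete, here is a minimal sketch of my own, not something taken from the book. The two goalies, the shot counts, and the two-zone "low" and "high" danger split are all hypothetical numbers chosen for illustration. Both goalies end up with the same raw save percentage even though they faced very different workloads.

```python
def save_percentage(saves, shots):
    """Fraction of shots stopped: saves divided by shots faced."""
    if shots == 0:
        raise ValueError("save percentage is undefined with zero shots")
    return saves / shots


# Hypothetical season totals as (saves, shots) per danger zone.
# These numbers are invented for illustration only.
goalie_a = {"low": (1128, 1200), "high": (249, 300)}
goalie_b = {"low": (837, 900), "high": (540, 600)}


def raw_save_percentage(zones):
    """Overall save percentage, ignoring where the shots came from."""
    saves = sum(s for s, _ in zones.values())
    shots = sum(n for _, n in zones.values())
    return save_percentage(saves, shots)


def high_danger_share(zones):
    """Share of all shots faced that came from the high-danger zone."""
    total = sum(n for _, n in zones.values())
    return zones["high"][1] / total


for name, zones in (("A", goalie_a), ("B", goalie_b)):
    by_zone = {z: round(save_percentage(s, n), 3) for z, (s, n) in zones.items()}
    print(name, round(raw_save_percentage(zones), 3), by_zone,
          round(high_danger_share(zones), 2))
```

In this made-up example the raw number cannot tell the two goalies apart, while the zone split shows that goalie B faced twice the share of dangerous shots. How a "dangerous" shot should actually be defined is exactly the kind of question the book asks readers to consider.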
Outside of hockey, Stat Shot also does a wonderful job of addressing important statistical concepts that are frequently used in the hockey analytics community. There is a brief tutorial on how to build a statistical model, how and when to regress a variable, how to assess random variation using even-odd split-half methodology, and an explanation of how to use the r and r^2 coefficients. Vollman and colleagues do an excellent job of explaining all of these concepts in plain language that allows the reader to easily grasp what each statistic is and why it is being used.
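For readers who have not seen an even-odd split-half check before, the idea can be sketched roughly as follows. This is my own illustration under stated assumptions, not the book's code: the goalies are simulated, and the talent spread, the number of games, and the shots per game are all invented. Each goalie's games are split into even-numbered and odd-numbered halves, save percentage is computed in each half, and the two halves are correlated across goalies. A high r suggests the metric is repeatable; a low r suggests much of it is random variation.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


def save_pct(games):
    """Save percentage over a list of (saves, shots) game logs."""
    return sum(g[0] for g in games) / sum(g[1] for g in games)


def split_half(goalies):
    """Correlate even-game and odd-game save percentage across goalies."""
    even = [save_pct(games[0::2]) for games in goalies.values()]
    odd = [save_pct(games[1::2]) for games in goalies.values()]
    r = pearson_r(even, odd)
    return r, r * r  # r^2 is the share of variance explained


def simulate(rng, n_goalies=30, n_games=60, shots_per_game=30):
    """Simulated game logs; every parameter here is an assumption."""
    goalies = {}
    for i in range(n_goalies):
        talent = rng.gauss(0.915, 0.008)  # assumed "true" save rate
        games = []
        for _ in range(n_games):
            saves = sum(1 for _ in range(shots_per_game) if rng.random() < talent)
            games.append((saves, shots_per_game))
        goalies[f"goalie_{i}"] = games
    return goalies


goalies = simulate(random.Random(42))
r, r2 = split_half(goalies)
print(f"split-half r = {r:.2f}, r^2 = {r2:.2f}")
```

The value of r depends entirely on the simulated inputs, so the number printed here says nothing about real goaltenders. With real game logs in place of `simulate`, the same two functions would give a first read on how much of a statistic carries over from one half of a season to the other.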
Approaching this book with more of a statistical background, I did have a few minor gripes. Most notably, several scatter plots were published without their r and r^2 values, yet the text commented on the positive or negative correlation. While this may seem like a relatively minor criticism, it is imperative that researchers publish the r and r^2 values for their scatter plots so that everyone understands the strength of the correlation and how much of the variance it explains. Judging a correlation by eye is harder than it sounds. Try this game out and see how many you can correctly guess. Maybe I'm just bad at the game, but my high score was 118 and I missed some r values by more than 0.3. Ultimately, I'm skeptical of any researcher who publishes a scatter plot without an r and/or r^2 coefficient and then comments on the positive or negative correlation.
The other challenge I would pose concerns the chapter on the "Projectinator" and projecting how junior players will perform in the NHL. I'm not so sure that subjective information such as a prospect ranking list or a World Junior Championship (WJC) team nomination should be included in a model that's designed to generate an objective evaluation. For me, it would be better to design the model using objective measures and separately discuss the merits of the CSS ranking list and WJC nomination. Additionally, the best models include only variables that have been tested for predictive value and found to be worth including. With adjustments for a player's size and subjective measures such as the CSS ranking list or WJC nomination included, I'm not certain that the model meets its aim of being as comprehensive and predictive as possible.
Overall, this book is a must-read for fans looking to dive into the world of hockey analytics, as it provides a great historical overview of the work that has been done and challenges fans to think contextually when evaluating statistics. It also provides a quick refresher on some of the basic statistical tests used and a brief explanation of why they are important. Those with more experience in hockey analytics will appreciate the exhaustive in-text referencing but may come away with several methodology questions. Ultimately, this book far surpasses the needs of its target audience, and I would highly recommend it to anyone looking to get started in hockey analytics. For those with more experience, Stat Shot's historical timeline and "next questions to answer" may still make this a useful purchase. In short, this is a worthwhile read for almost everyone interested in hockey analytics.