Advanced Stats for Non-Stats People: Article One - Goals Above Replacement

Connor McDavid led all players with a SPAR of 9.2. If you don’t know what that means, you should by the end of this article.


If you read my writing here, you know that I often use “analytics” or “advanced stats” in my articles. People often have questions, and many of them take longer to answer than is practical in the comments.

So I decided to create a series in which I attempt to explain these concepts to those of you who aren’t very interested in statistics. The “attempt” applies to me, not the reader; it’s my responsibility to make this understandable.

I’m going to start off with Goals Above Replacement, which is often used to argue the value of different players, either individually or when compared to each other.

GAR - Goals Above Replacement

The idea here is to attempt to quantify a player’s value to his team. I use the word “attempt” because it’s something that can be improved over time with more information. The eye test is also an attempt at analysis. Using publicly available data (for public models), the model breaks down player performance into several categories.

Evolving Hockey’s model uses the following categories:

Even Strength Offense
Even Strength Defense
Power Play Offense
Short Handed Defense
Penalties Taken
Penalties Drawn

(Their site has writeups on their model that go into MUCH more detail than I do here.)

For each of the categories, the number you see is what the model calculates as the value of that category. So if a player has a +1 for Penalties Taken, that doesn’t mean they took 1 penalty. It means that the value calculated is 1. What that “1” means depends on what number you are looking at.
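To make the breakdown concrete, here is a minimal Python sketch. It assumes a player's overall number is simply the sum of the six category values, which is how I read the output but is not something I have confirmed against the model itself. The player and every number below are made up for illustration.

```python
# Hypothetical category values for one made-up player (not real data).
# Assumption: the overall number is the plain sum of the six categories.
components = {
    "Even Strength Offense": 4.0,
    "Even Strength Defense": -1.5,
    "Power Play Offense": 2.0,
    "Short Handed Defense": 0.5,
    "Penalties Taken": 1.0,   # a value of +1, NOT one penalty taken
    "Penalties Drawn": -0.5,
}


def total_value(parts):
    """Add up the category values to get the player's overall number."""
    return sum(parts.values())


total = total_value(components)
print(f"Total: {total:+.1f}")
for name, value in components.items():
    print(f"  {name}: {value:+.1f}")
```

Reading the numbers this way lets you see where a player's value comes from, not just how much of it there is.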

What I mean by that is that you can look at the output as WAR (Wins Above Replacement), GAR (Goals Above Replacement) or SPAR (Standings Points Above Replacement). I currently think that SPAR is the best lens to view these numbers through because it’s the easiest to understand. Because of the loser point (the single point a team earns for losing in overtime or a shootout), I think that SPAR is slightly better than WAR. It’s important to understand that each of those three numbers is based on the same model output; the difference is how they are presented to you. So as long as you are looking at the same metric for each player, the comparisons will be the same.
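Here is a rough Python sketch of why the choice of unit doesn't change a comparison. It assumes that, within a season, each conversion is just division by a single positive constant. The two constants and the three skaters are invented for the example and are not the model's actual values.

```python
# Illustrative conversion factors. These are ASSUMPTIONS for this sketch,
# not the numbers the Evolving Hockey model actually uses.
GOALS_PER_WIN = 6.0
GOALS_PER_STANDINGS_POINT = 3.0

# Made-up GAR values for three hypothetical skaters.
gar = {"Skater X": 9.0, "Skater Y": 3.0, "Skater Z": -1.5}


def rescale(values, goals_per_unit):
    """Express GAR in another unit by dividing by a positive constant."""
    return {name: v / goals_per_unit for name, v in values.items()}


def ranking(values):
    """Return player names ordered from most to least valuable."""
    return sorted(values, key=values.get, reverse=True)


war = rescale(gar, GOALS_PER_WIN)
spar = rescale(gar, GOALS_PER_STANDINGS_POINT)

# Dividing every player by the same positive constant never changes the order.
print(ranking(gar))
print(ranking(war))
print(ranking(spar))
```

All three lists come out in the same order, which is the point: pick one metric, stick with it, and the comparison holds.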

Often people look at GAR because the numbers are higher, making it easier to see the difference. For example:

Comparing Aaron Ekblad’s numbers to Adam Boqvist’s (the first two players that come up alphabetically on the default list):

Ekblad: GAR 2.5, WAR 0.4, SPAR 0.8
Boqvist: GAR 2.1, WAR 0.4, SPAR 0.7

Hopefully that illustrates my point. The downside to using GAR, in my opinion, is that it’s hard to conceptualize.

Let’s take two hypothetical players: Player A has 3.4 GAR and Player B has 1. What the hell does that MEAN? That’s why I personally like using SPAR. If Player A had 3.4 SPAR, that would show that they contributed 3.4 more standings points to their team than a replacement level player would have been expected to, while if Player B had 1 SPAR, they contributed 1 standings point more than a replacement level player would have.

Common Questions/Misunderstandings

To address a common misunderstanding: these models attempt to account for teammates and competition. People often say that you can’t trust these numbers because players play on different teams. The people making the models know that, and measures to account for it, and for many other factors, are built into the model.

There is a danger in just looking at something like SPAR and treating it as the value of a player. No model is going to be perfect, and because GAR is cumulative, players who play more games have more opportunities to make a positive impact (vice versa for a negative one). To combat that, using a rate version can be helpful. So if you see something like SPAR/60, that means that the number you see is a player’s Standings Points Above Replacement impact per 60 minutes of ice time.
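The arithmetic behind a rate stat is simple enough to sketch in a few lines of Python. The function name and both players' numbers are hypothetical; the only thing taken from above is the idea of dividing a cumulative total by ice time and scaling to 60 minutes.

```python
def per_60(total, minutes_played):
    """Convert a cumulative total into a per-60-minutes rate."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return total / minutes_played * 60


# Made-up season lines: cumulative SPAR and total ice time in minutes.
full_season = {"spar": 4.0, "toi": 1600.0}
half_season = {"spar": 2.5, "toi": 800.0}

full_rate = per_60(full_season["spar"], full_season["toi"])
half_rate = per_60(half_season["spar"], half_season["toi"])

# The full-season player has the bigger total,
# but the half-season player has the bigger rate.
print(f"Full season: {full_season['spar']:.1f} SPAR, {full_rate:.4f} SPAR/60")
print(f"Half season: {half_season['spar']:.1f} SPAR, {half_rate:.4f} SPAR/60")
```

In this made-up case the player with fewer minutes looks worse on the cumulative number and better on the rate, which is exactly the gap a rate stat is meant to expose.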

You still have to be careful because if a player plays five games and has a SPAR/60 that’s really high, that doesn’t mean that player will be a superstar. As with every statistic, a quality evaluation of a player will consist of different inputs. Something like SPAR can be helpful because it gives you a quick comparison for players, but in order to make a quality evaluation of a player, you should look further than just the one number.
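One common way to guard against tiny samples is to require a minimum amount of ice time before trusting a rate. The Python sketch below shows the idea; the 500-minute cutoff, the function names, and both players are assumptions I made up for the example, not a rule from any model.

```python
MIN_MINUTES = 500.0  # hypothetical cutoff, chosen only for illustration

# Made-up players: cumulative SPAR and total ice time in minutes.
players = [
    {"name": "Regular", "spar": 4.0, "toi": 1600.0},
    {"name": "Call-up", "spar": 0.3, "toi": 60.0},
]


def spar_per_60(player):
    """Rate version of SPAR: impact per 60 minutes of ice time."""
    return player["spar"] / player["toi"] * 60


def qualified(pool, min_minutes=MIN_MINUTES):
    """Keep only players with enough ice time for the rate to mean much."""
    return [p for p in pool if p["toi"] >= min_minutes]


# Without a cutoff, the short-stint call-up tops the rate leaderboard.
raw_leader = max(players, key=spar_per_60)
# With the cutoff, only the regular remains to be compared.
filtered_leader = max(qualified(players), key=spar_per_60)

print("Raw leader:", raw_leader["name"])
print("Leader with minimum ice time:", filtered_leader["name"])
```

A cutoff like this doesn't say the call-up is bad. It only says there isn't enough ice time yet to lean on the rate.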

Did the player get injured in game 30 and not play the rest of the season? You should probably look into the injury and take it into account when considering what their future play may be. Where are they on the standard aging curve? Do they seem to be a player that may defy the standard aging curve? Why? Is that an objective idea, or are you hoping they will as a fan? Does their SPAR surprise you? How come? Take a look at the more detailed breakdown and find where the numbers don’t fit what you thought they would. There could be a reason why the model is telling you something different than you expect. The model could be miscalculating that aspect of the player’s impact.

It could also be that your understanding of a player’s value from watching them is wrong. It’s important to keep an open mind in both regards. This is something I have to continuously remind myself, so please don’t think I’m only applying that statement to readers or commenters.

I personally think that the numbers from this past season should be taken with a large grain of salt, especially the negative ones. The pandemic led to players playing under very different conditions from previous seasons, and I think that should be considered. Judging from comments made by many players, I believe the isolation they played under had a very real negative impact.

Lastly, “what does replacement level mean?”

From the Evolving Wild writeup on their GAR model:

That said, it might be hard to understand why we use replacement level and not average as our baseline – the concept of “average” is very common and well understood by most. But the issue within evaluation here is that an “average” player is valuable; when we baseline our metric around an average player, we lose an ability to apply meaningful context to our evaluation. Replacement level, on the other hand, sets the baseline lower – the baseline now becomes the level at which a player can be determined to be “replaceable” i.e. easily substituted for another readily-available player.

So for hockey, you are talking about the 13th forward or 7th defenseman level of play. That is the simple version. If you want to read more, Part 3 of their writeup goes into more detail.

Another way to think about the replacement level is that it would be the average “next player up” if someone on your team was traded or injured. Some teams have much better depth than others, so it’s not applied on a team-by-team basis. Some teams may be playing with a few (or many) replacement level players in their NHL lineup for different reasons.

Where do I find GAR?

The current best version in my opinion is found at Evolving Hockey. The downside is that some of the site is paywalled. Josh and Luke put a lot of effort into the site, so I feel that the money is worth it. I completely understand, however, that the majority of hockey fans aren’t going to want to pay for that.

Even so, hopefully you will have a better understanding of what people mean when they use the terms GAR, WAR, or SPAR.

If you have questions, please ask in the comments. I may update this article with the questions and answers.

I have some ideas for future articles in this series, but I want to know what you think! Are there any other “advanced stats” questions that I could answer in an article?