Long Read: Carolyn Goes Overboard About the Importance of Analytics

If you’re a reader of Two Bearded Ladies, or Defending Big D, or Puck Daddy, or literally any hockey blog on the internet, you’ll have heard the terms Corsi, Fenwick, and Fancy Stats (more accurately known as advanced stats) thrown around a lot.

The hockey world was all abuzz this summer when Kyle Dubas, analytics darling, was hired as the Maple Leafs’ assistant GM. Almost every team in the league is talking about Fancy Stats, whether it’s to begrudgingly embrace them, like the Leafs, or to question their usefulness, as Brian Burke has.

As a former market analyst and product forecaster for a big tech company, I heard the term ‘data-driven decision making’ so often that even now, I can’t bring myself to laugh at its buzzwordiness. I’m all-in on analytics as a valid methodology for evaluating circumstances and for attempting to predict future events, whether we’re talking about selling a car or winning a game.

Every day, you’re a beneficiary of analytics whether you know it or not. How do you think that weather app was able to tell you today’s high temp?

Yes, some days it’s wrong and you’ll be grabbing the umbrella and getting only sunshine. Some days you’ll know it’s raining before you even check your phone. Some days you won’t believe the app when you walk outside to birds singing and bright skies, but then sure enough, thunderstorms roll through. Over time, being prepared by your weather app means you’ll get rained on less.

Heavy-handed metaphors aside, hockey’s benefitting from analytics in much the same way.

Analytics tell us what we already know.

Only now we know we know it.

For instance, watching Sidney Crosby play is enough to tell you that he’s better than good. In fact, I doubt there’s anyone out there who will deny that he’s an elite player (and if there are, they probably live in Philly).

But fancy stats give you a way to measure the difference between him and other players. The most popular metric used by hockey analysts is Corsi For, a measure of puck possession. Since this article is not intended to be a primer on the stats themselves, I’ve linked several excellent primers and websites at the end.

Since the 09/10 season, Crosby’s Corsi For % hasn’t dipped below 50%. A straight average over his entire career comes to 53%. For comparison, his partner in crime, Evgeni Malkin, has a career Corsi For % of 52.3%. That’s a gap of less than one percentage point, so instead it’s helpful to look at their Corsi Relative %, which measures a skater’s Corsi For relative to the team, aka how much a player helps or hurts possession. Over his career thus far, Crosby is rocking a 5.44% Corsi Rel %. Malkin is only at 3.53%.
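For the numbers people, here is a rough sketch of the arithmetic behind those two percentages. It assumes the common definitions (Corsi For % is your team’s share of all shot attempts while you’re on the ice, and Corsi Rel % is that share minus the team’s share while you’re on the bench). The function names and the shot counts are made up for illustration; they are not Crosby’s or Malkin’s real totals.

```python
def corsi_for_pct(attempts_for: int, attempts_against: int) -> float:
    """Team's share of all shot attempts, as a percentage."""
    total = attempts_for + attempts_against
    if total == 0:
        raise ValueError("no shot attempts recorded")
    return 100.0 * attempts_for / total


def corsi_rel_pct(on_for: int, on_against: int,
                  off_for: int, off_against: int) -> float:
    """On-ice Corsi For % minus the team's Corsi For % with the player off the ice."""
    return corsi_for_pct(on_for, on_against) - corsi_for_pct(off_for, off_against)


# Hypothetical skater: 550 attempts for and 450 against while on the ice,
# and the team splits attempts 500/500 while he sits on the bench.
on_ice_pct = corsi_for_pct(550, 450)
rel_pct = corsi_rel_pct(550, 450, 500, 500)
print(f"Corsi For %: {on_ice_pct:.1f}, Corsi Rel %: {rel_pct:+.1f}")
```

A positive Corsi Rel % means the team controls more of the shot attempts with that skater on the ice than without him.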

We already knew that Crosby was a great help to his team. But now we can quantify that, and use it as a yardstick for other elite players.

Not so much a numbers person? Analytics can help you with that. This graph shows all the Pittsburgh forwards with >600 minutes of ice time.

The Y axis is Corsi per 60 minutes (aka the expected number of Corsi events throughout an entire game), and the X axis is total minutes played. The more interesting context, though, is color (Corsi Rel %, as defined above) and size, which is on-ice goal differential (goals scored when the player is on the ice minus goals scored without that player).
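The “per 60” part is just a rate conversion, so players with very different ice time can sit on the same axis. A minimal sketch, with a hypothetical function name and made-up totals:

```python
def per_60(events: int, minutes_played: float) -> float:
    """Scale a raw event count to a rate per 60 minutes of ice time."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return events * 60.0 / minutes_played


# Hypothetical forward: 700 Corsi events in 600 minutes of ice time.
rate = per_60(700, 600.0)
print(f"Corsi/60: {rate:.1f}")
```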

There are a couple of players who have better possession numbers than Crosby, and Malkin has played more career minutes for the Pens, but Crosby not only has one of the higher Corsi/60 rates, he also has a high Corsi Rel % and the largest goal differential. That is what it means to be elite.

Analytics are also helpful when trying to contextualize things like the Hart Trophy, given to the player judged most valuable to his team. Last season (13/14), Crosby brought home three trophies, including the Hart. The Philly fans were especially salty over that one, as Claude Giroux was also nominated, presumably for single-handedly dragging the Flyers into the playoffs.

I submit to you the following evidence of why Crosby won the Hart over Giroux. Please note that on these graphs, red is good and blue is bad.

First, Giroux:

Wow, that’s a big difference in shot rate with and without him! Obviously a deserving candidate.

But, here’s Crosby:

Enough said.

Again, watching these guys play is enough to tell us how talented they are. Nothing compares to the thrill of watching Crosby on a beautiful breakaway.

But it’s one thing to know he’s good, and another thing to know exactly how good.

Analytics challenge conventional wisdom.

One of my favorite things about analytics is that while you generally look at the root cause of events with a hypothesis in mind, the math doesn’t care what you believe.

For instance, how many times have we heard commentators go on and on about “physicality” and “really needing to follow through on hits” and how “hits can change the momentum of a game”?

Well, Jen Lute Costella (@RegressedPDO) recently took a look at this widely held assumption through the lens of analytics.

The TL;DR point of her article is that there is no statistical correlation between hitting and shot suppression nor between hitting and puck possession. Hitting sticks around for hitting’s sake.

Pro-hitting fans would sneer at this data because it’s easy to go on YouTube and find video of a big hit causing a turnover leading to a scoring chance. It does happen. But to use this as the only proof of hitting’s usefulness is to be a victim of a psychological phenomenon known as the “availability heuristic” – the tendency to look at only the examples that come immediately to mind when discussing a topic. It’s going to be a lot harder to find videos of “Checking Lines” going out for a shift and coming up empty handed, because, well, no one wants to watch that.

Now, analytics isn’t saying that hitting is never effective, especially in a game-by-game scenario, but what it does say is that over time, there is no discernible benefit from having a higher hit rate.
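If you’re wondering what “no correlation” looks like in practice, here is a small sketch using a plain Pearson correlation coefficient. This is not Costella’s actual method or data; the four team rows below are toy numbers contrived so that hit rate and possession are unrelated, and the function name is my own.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation: +1 or -1 is a perfect linear link, 0 is none."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sample has no variation")
    return cov / math.sqrt(var_x * var_y)


# Toy data (made up): hits per game and Corsi For % for four imaginary teams.
hits_per_game = [18.0, 22.0, 26.0, 30.0]
possession_pct = [51.0, 49.0, 49.0, 51.0]

r = pearson_r(hits_per_game, possession_pct)
print(f"correlation between hitting and possession: {r:.2f}")
```

With real data you would never land on exactly zero, which is why analysts look at how close the coefficient sits to zero across many teams and seasons rather than at any single game.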

And that’s the great thing about analytics; it gives you a framework in which you can evaluate your own longstanding beliefs with more than just anecdotal evidence.

Analytics Put the Educated into Educated Guess.

For most of its detractors though, the real litmus test of all this analysis is predictive ability. For them, analytics are a waste of time, because you can watch a game and know who the better team is, and usually the better team wins. No one can predict lucky bounces. No one can predict game-changing injuries.

And they’re right. If you’re a betting man and you’re using Corsi as your yardstick for success – you’re probably going to lose a lot of money. But when you look at analytics as a tool of the coaching staff, and probably more importantly, the GM, its value is immense.

For instance, Stan Bowman, GM of the Chicago Blackhawks, has mentioned he’s used proprietary analytics for years. The Blackhawks are consistently at the top of possession charts, usually have one of the lowest Fenwick Against rates, and generate some of the scariest offense in the league. Obviously, I’m not a part of Stan’s staff, so I don’t know what he’s looking at statistically, but when he goes to the draft, I would bet my life he knows exactly what kind of player he wants, and it doesn’t have a lot to do with “intangibles.”

I actually hate that word, “intangibles.” The argument is that things like “leadership” and “competitiveness” and “toughness” can’t be measured. That’s bullshit.

No, you aren’t going to go through the NHL and ask each player to rank each other’s leadership ability on a 10-point scale, but you can measure behavior. Good leaders stay calm under pressure – and we do measure turnovers in close game situations, and penalties taken. Good leaders make their teammates better – and we already measure quality of teams (and teammates) with or without a player in several different metrics. Good leaders earn respect by doing – and we already know how to rank players in every metric known to man.

When the Blackhawks drafted Jonathan Toews, no one could have predicted that he would be the captain who would lead them to two Stanley Cups in four years, but the front office knew it was getting a possession-first two-way center with maturity beyond that of an average eighteen-year-old.

Bowman built the team around Toews and Kane and their speedy style of possession-based play. There aren’t many “power forwards” on the Blackhawks, and it’s because big men have a harder time keeping up. Even the defense on the Blackhawks is fast and better than average in possession. This is deliberate. Bowman set up a system through smart drafting and sticking to a plan over time.

Have I mentioned the “over time” bit enough yet? Because no, having a good grasp on metrics is not a magic cure-all that suddenly means a team wins every game.

If it did, the Blackhawks would’ve won the Cup every year, and clearly, that’s not the case. But by continuing to draft to the system, to coach to the system, and to augment the core with trades that work within the system over time, it has paid off. Twice.

For comparison, the Edmonton Oilers have not one, but three franchise-able players. Taylor Hall, in particular, can be one of the most dynamic players in the game. Yet, the continued anguish of the Oilers faithful is not about to end anytime soon.

Whose fault is this? The leaky defense? The shoddy goaltending? The over-hyped first round draft picks? The multiple coaches over the last few years? The multiple GMs?

It’s plenty clear that the Oilers management hasn’t had a clear plan. They have no specific style of play (are they a hitting team? are they a fast team?), they seem to go for “best available” when drafting (instead of looking at where a player would fit in with the admittedly nonexistent system), and of course, they throw around money when the ROI isn’t the greatest.

So it shouldn’t be surprising when I say that the Oilers hired their first analytics guy this summer. Godspeed, good sir, and I hope management listens when you talk about results over time.

Jim Nill has gone on record saying he’s a big supporter of the analytics movement, and even though the Stars haven’t hired anyone specifically to run a team of analysts, they do use the numbers as ‘coaching opportunities,’ especially for their third and fourth line forwards. It’s going to be interesting to see whether that continues to be how the organization treats analytics or whether their needs will grow as the field develops. Either way, as one of the so-called stat nerds, it’s gratifying to support a team that finds this as important as I do.

What I’m saying is this: If you don’t care about fancy stats, that’s cool. Not every fan needs to be a part-time analyst. But it’s hard to deny that analytics have a place in hockey, especially in the back offices and with the coaching staff.

And for some of us, understanding the why behind the what makes the game even more fun.

Looking for more information on fancy stats? Here are some excellent resources:

The Second City Hockey Stat Primer:

http://www.secondcityhockey.com/2013/12/4/5167404/nhl-stats-made-simple-part-1-corsi-fenwick

The Sports Illustrated Stat Primer:

http://www.si.com/nhl/2014/09/28/fancy-stats-primer-advanced-analytics-corsi-fenwick-pdo-qualcomp

Stats Databases:

http://war-on-ice.com/ (my personal favorite, especially for three- and four-dimensional graphs, though the terminology and user interface can be confusing)

http://www.hockey-reference.com/ (straightforward tables of seasons’ worth of data)

http://stats.hockeyanalysis.com/ (One of the only sites that gives you With Or Without You data)
