Hi. How are you? I hope you’re doing well. On my side, I won’t lie, life has been a rollercoaster. Lately, I have been shaken by some unexpected bad news, but we have to keep living because better things and moments are ahead. On another note, I am doing well. I had the opportunity to listen to OTTHAC22 (the Hockey Analytics Conference in Ottawa), where I learned a lot, and the team where I am an intern said they will need me this summer (we had been left uncertain since the end of their season a couple of weeks ago). Also, I have started working on a few projects that should be available to the public in a couple of months. The first project is building a tool to evaluate prospects for the 2022 NHL Draft [doing my best to avoid saying NHLe because I know that a lot of people fear that word, but don’t worry, just like everyone else I will prepare different things that will allow a potential user to understand how the tool was built (with all its strengths and limitations) and how it should be used] and the second one is a project I don’t know much about yet… I just know I have 60 days to submit it for the Big Data Cup (and it will be the first time I ever get to play with proprietary and women’s hockey data).
In our short time as SC Bern interns, we (because we are 2 friends who got hired for the internship, not only me) were asked to look at the odds of winning in back-to-back games in different contexts (mainly venue related)… and it was a fun project to work on. Even though I didn’t learn anything related to expected goals, it was still my first experience working for a coach and I really enjoyed it. I can’t wait for the summer to start because I am addicted to doing hockey analytics work (that’s the thing I enjoy the most in life).
My intro was a little bit long, but if you are still with me here, I will finally explain my understanding of some basic concepts of hockey analytics.
Why do I want to do that?
Some analytics concepts are now part of my normal vocabulary and it is super annoying to have to re-explain the most basic concepts for the 1,000,000th time. I will if I have to, but at some point, it is repetitive and it is not fun for me. It is nothing personal, it might just make me less interested in talking about analytics with you.
One really good way to learn and understand something is to explain it to other people. So, here I am. I have the idea to make a series of videos on YouTube about it… but the idea will stay an idea because I think I still need more knowledge before I start explaining things I don’t necessarily understand fully to the general public. I need to be 110% sure I understand the topic before I jump into that kind of project. For the French-speaking people reading this, I did not forget about you : most of the hockey analytics work is in English and to democratize it, it is important to make the field available in both languages, so it is also part of my goals to do this.
I really want to help. If you have some idea of how the concepts work and you would benefit from my perspective on them, this article is THE ARTICLE YOU NEED [with the Shamwow! guy voice].
Without further ado, let’s start.
Corsi
I have to start with the grandpa of hockey analytics as we know it today. I don’t have to tell you that it started with a coach keeping track of shot attempts and that, a few years later, the first advanced stat carried his name, but that’s the story.
Definition : Shot attempt… yes, it’s that simple.
Like all other non-baked stats (as my friend Waveintel likes to call them), IT WILL NOT TELL YOU IF YOUR FAVOURITE PLAYER IS GOOD OR NOT. I will say it now so I don’t have to say it again : for player evaluation, using one stat is far from enough. Corsi will give you a good indication of the flow of the game. Seeing a team with a high (over 55%) Corsi% (CF%) during a game indicates that the team you are looking at had more shot attempts than the opposing team (and you can draw your own conclusions from that and other data). Corsi is known as a possession metric. The team with the most shot attempts probably had more possession of the puck, and it is an indication that the team may have played better than the opposing team. A team that has a high Corsi% (CF% = 100 x CF/(CF + CA), where CF is shot attempts for and CA is shot attempts against) during a season has outshot-attempted its opponents during that season, and that’s good information to keep in mind. In the opposite scenario, where your team has a Corsi% of 33% after 40 games, you might keep tanking or make urgent changes.
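To make the percentage concrete, here is a minimal sketch of the usual CF% calculation, where CF% is the team's share of all shot attempts in the game: 100 x CF/(CF + CA). The function name and the numbers are made up for illustration.

```python
def corsi_for_pct(cf: int, ca: int) -> float:
    """Return CF% = 100 * CF / (CF + CA): the team's share of all shot attempts."""
    total = cf + ca
    if total == 0:
        # No shot attempts at all, so the percentage is undefined.
        raise ValueError("no shot attempts recorded")
    return 100 * cf / total


# Made-up game: 60 shot attempts for, 40 against.
print(corsi_for_pct(60, 40))  # 60.0
```

With this definition, 50% means both teams attempted the same number of shots, and anything above 50% means your team attempted more than the opponent.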
Another way to use it is to look at the Corsi when a specific player is on the ice. If your favourite player Woumaxx has a CF% of 69% against the Maple Leafs… you will be happy because when he was on the ice, the Canadiens outshot-attempted the Leafs. But if another player, Jeffrey Pourry, has a CF% of 12% against the same Leafs, it should be alarming and you, as a coach or GM or Twitter superfan, will need to find out why the Canadiens got cooked when Pourry was on the ice.
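Here is a rough sketch of how an on-ice CF% could be counted for one player. The data layout is an assumption for the example: the field names (event, team, on_ice) are hypothetical, "team" is assumed to be the team that took the shot attempt, and the events are invented, so they will not reproduce the 69% and 12% above.

```python
CORSI_EVENTS = {"Block", "Shot", "Miss", "Goal"}

# Toy events; field names are hypothetical and "team" is the shooting team.
events = [
    {"event": "Shot", "team": "MTL", "on_ice": {"Woumaxx", "Pourry"}},
    {"event": "Miss", "team": "MTL", "on_ice": {"Woumaxx"}},
    {"event": "Block", "team": "TOR", "on_ice": {"Pourry"}},
    {"event": "Goal", "team": "MTL", "on_ice": {"Woumaxx"}},
    {"event": "Faceoff", "team": "TOR", "on_ice": {"Woumaxx", "Pourry"}},
    {"event": "Shot", "team": "TOR", "on_ice": {"Woumaxx", "Pourry"}},
]


def on_ice_cf_pct(events, player, team):
    """CF% for `team`, counting only shot attempts taken while `player` was on the ice."""
    cf = ca = 0
    for e in events:
        # Skip non-Corsi events and events where the player was on the bench.
        if e["event"] not in CORSI_EVENTS or player not in e["on_ice"]:
            continue
        if e["team"] == team:
            cf += 1
        else:
            ca += 1
    if cf + ca == 0:
        return None  # the player saw no shot attempts
    return 100 * cf / (cf + ca)


print(on_ice_cf_pct(events, "Woumaxx", "MTL"))  # 75.0
```

In real play-by-play data you should check which team a blocked shot is credited to, because that can differ from one source to another.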
One of the really cool things about Corsi is that it’s one of the good predictors we have for future scoring.
Like a lot of shooting stats, Corsi is often subject to adjustments. The venue (the home team has an edge on the away team) and the score (trailing teams shoot more than teams with a lead) have an impact on a lot of metrics, so in order to be fair and evaluate players and teams equally, we apply adjustments that remove the effect of the score and the venue (and we often evaluate how teams did at 5on5).
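To see why the score matters, here is a small sketch that keeps only 5on5 shot attempts and splits the raw CF% by score state. It does not perform the adjustment itself (the weights used for that are not covered here); it only shows the kind of split the adjustment is meant to correct. The data is invented and every column name is hypothetical.

```python
import pandas as pd

# Toy shot attempts from one team's point of view; all column names are hypothetical.
attempts = pd.DataFrame({
    "strength": ["5v5", "5v5", "5v5", "5v5", "5v4", "5v5", "5v5", "5v5"],
    "score_state": ["trailing", "trailing", "trailing", "tied",
                    "tied", "leading", "leading", "leading"],
    "is_for": [True, True, False, True, True, False, False, True],
})

# Keep only 5on5 play, then take the share of attempts "for" in each score state.
five_on_five = attempts[attempts["strength"] == "5v5"]
cf_pct_by_state = five_on_five.groupby("score_state")["is_for"].mean() * 100

print(cf_pct_by_state)
```

In this toy data the team has a higher CF% when trailing than when leading, which is the pattern score adjustments try to remove before comparing teams or players.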
When you scrape the NHL play-by-play data, shot attempts (or Corsi) include blocked shots, saved shots on goal, missed shots and goals.
The code I’m writing in Python to select the Corsi events is:
# game is a pandas DataFrame of play-by-play rows; keep only the shot-attempt events
corsi = game.loc[game.events.isin(["Block", "Shot", "Miss", "Goal"])]
The main thing you need to remember about Corsi is that it’s a measure of the shot attempts in a game. It is different from the famous shots on goal metric because shots on goal, as universally defined, only include saves + goals and don’t include missed and blocked shots.
I am a little bit tired of writing, so I will stop for now, but if you like it, let me know and I will go deeper on other stats.
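The difference is easy to see side by side. This is a small sketch on a made-up game, assuming the same event labels as my filter above and assuming that "Shot" means a shot on goal that was saved.

```python
import pandas as pd

# Made-up play-by-play with an "events" column, as in the Corsi filter.
game = pd.DataFrame(
    {"events": ["Shot", "Miss", "Block", "Goal", "Faceoff", "Shot", "Hit"]}
)

# Corsi: every shot attempt, whether or not it reached the net.
corsi = game.loc[game.events.isin(["Block", "Shot", "Miss", "Goal"])]

# Shots on goal: only saves ("Shot") and goals.
shots_on_goal = game.loc[game.events.isin(["Shot", "Goal"])]

print(len(corsi), len(shots_on_goal))  # 5 3
```

The two extra Corsi events are the missed shot and the blocked shot, which never show up in the shots on goal total.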
Have a good night.
yo fr i think it would be nice if you did this for other stats. I feel like it could be a nice way for people on Habs twitter that fw analytics (or that are actually one of the rare people who are genuinely open to learning about them), but who don't exactly get them, to understand the basics a bit more (maybe just the basic shit you see on player cards/graphs and whatnot would be nice idk lol)
Jeffrey Pourry 💀💀💀