After completing my first NHLe model earlier this summer, I was on a quest for a new challenge.
I wanted to focus on doing something useful. With no background in statistics, mathematics or engineering, building my own Expected Goals model would be a really big challenge for me, but that’s what I ended up doing. Part of what pushed me to work on this project is that I already understood the concept of xG but wanted to understand it more deeply, and there is no better way of understanding xG than building my own xG model. Not only can I now explain to people how xG is computed, but I also feel more aware of how it is done, which puts me in a better position to use it. I now have a better understanding of the limitations of Expected Goals and of its strengths.
Expected Goals is a big concept in the “hockey analytics world”. It gives us an idea of shot quality, and it is one of the fundamental concepts behind other, much more complicated but still famous hockey analytics tools like Goals Above Replacement (GAR/WAR/SPAR) and RAPM charts (and Hockey Viz’s charts).


Let’s talk about Breezy!
I called my model Breezy because it’s a cool name and it reminds me of someone I like a lot.
Breezy is a logistic regression model that uses about 15 features per shot to estimate the probability that the shot turns into a goal. Among the features I take into consideration, shot distance is by far the most important.
I picked logistic regression after reading a lot about it. It was by far the model I understood best, and after many discussions with people from the field of sports analytics, it seemed like a good choice for my first model. I definitely plan to try other models once I understand them better. I know that a lot of hockey xG models use XGBoost instead of the logistic regression I use, and it intrigues me a lot… but I’m new to models and I hope to understand them better as soon as possible.
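To make the idea concrete, here is a minimal sketch of a logistic regression xG model. It is not Breezy's actual code: the three features (distance, angle, rebound flag), their scaling, the made-up "true" coefficients used to simulate shots, and every function name are assumptions for illustration only. A real model would be trained on actual shot data and would likely rely on a library rather than hand-written gradient descent.

```python
import math
import random


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def featurize(distance_ft, angle_deg, is_rebound):
    # Intercept first, then roughly scaled features (hypothetical choices).
    return [1.0, distance_ft / 100.0, angle_deg / 90.0, float(is_rebound)]


def make_synthetic_shots(n, seed=0):
    """Simulate shots whose goal probability drops with distance (made-up data)."""
    rng = random.Random(seed)
    true_w = [-1.0, -4.0, -1.0, 0.8]  # invented coefficients, not real estimates
    shots = []
    for _ in range(n):
        x = featurize(rng.uniform(5, 90), rng.uniform(0, 90), rng.random() < 0.1)
        p = sigmoid(sum(w * xi for w, xi in zip(true_w, x)))
        shots.append((x, 1 if rng.random() < p else 0))
    return shots


def fit_logistic(shots, epochs=300, lr=1.0):
    """Fit the weights with plain batch gradient descent on the log loss."""
    n_features = len(shots[0][0])
    w = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for x, y in shots:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi - lr * g / len(shots) for wi, g in zip(w, grad)]
    return w


def predict_xg(w, distance_ft, angle_deg, is_rebound=False):
    """Return the modelled probability that a single shot becomes a goal."""
    x = featurize(distance_ft, angle_deg, is_rebound)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


weights = fit_logistic(make_synthetic_shots(2000))
print("xG from 10 ft:", round(predict_xg(weights, 10, 0), 3))
print("xG from 60 ft:", round(predict_xg(weights, 60, 0), 3))
```

In this toy setup the fitted distance weight comes out negative, so a shot from 10 feet gets a higher xG than the same shot from 60 feet, which matches the intuition that distance matters a lot.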
I did not build the model in a vacuum. I found a lot of documentation online on how previous xG models were built. As much as I take pride in building the model, I would not have reached the end product without the previous work of Harry Shomer, Matthew Barlowe, the twins from Evolving-Hockey, Patrick Bacon, Micah Blake McCurdy, HockeySkytte, and many more, along with all the people who inspired them to come up with their own xG models (like Asmae Toumi, who built the very first hockey xG model, or one of the first, if my hockey analytics history is right, and Emmanuel Perry, or mannyelk, who pushed hockey analytics to another level with Corsica).
Here’s how my model performs.
0.74… It is definitely not amazing. I know I’m not a game changer in the analytics community, and I probably have one of the worst xG models out there, but I am proud of how the project turned out. I trained the model on five seasons’ worth of shots. Even if it’s far from perfect, the project is complete, and from here I can only improve it and learn as much as I can to improve the model’s performance. Since the model performs okay, I can also use it to build other tools that need either the model itself or the knowledge I gained from building it.
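The text above does not name the metric behind the 0.74. One metric often reported for xG models is the area under the ROC curve (AUC), so assuming that is what the number refers to, here is a minimal sketch of how such a score can be computed. The function name and the tiny sample data are made up for illustration.

```python
def roc_auc(labels, scores):
    """Chance that a random goal gets a higher score than a random non-goal.

    Ties count as half a win. 0.5 means random ranking, 1.0 means perfect.
    """
    goals = [s for y, s in zip(labels, scores) if y == 1]
    misses = [s for y, s in zip(labels, scores) if y == 0]
    if not goals or not misses:
        raise ValueError("need at least one goal and one non-goal")
    wins = 0.0
    for g in goals:
        for m in misses:
            if g > m:
                wins += 1.0
            elif g == m:
                wins += 0.5
    return wins / (len(goals) * len(misses))


# Six made-up shots: 1 = goal, 0 = no goal, with the xG the model gave each.
labels = [0, 0, 1, 0, 1, 0]
scores = [0.02, 0.10, 0.08, 0.05, 0.30, 0.04]
print(roc_auc(labels, scores))  # 7 of the 8 goal/non-goal pairs are ranked correctly
```

Under that reading, a score of 0.74 would mean that when you pick one goal and one non-goal at random, the model gives the goal the higher xG about 74% of the time.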
Actually, I have already started building tools that require an xG model. I have shared a Tableau file where people can play with a bunch of features and sheets to generate the graph they need. It is mostly about individual goals and individual expected goals at all strengths during the 2021-2022 season. The link is HERE. It does not say much, but it is still a good learning experience for me and a nice toy for you.
After playing with the filters on one of my Tableau sheets, I found that Cole Caufield was one of the most dangerous offensive players among those born after 1999.
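For readers wondering how individual expected goals relate to the model: a player's individual xG is usually just the sum of the xG values of the shots that player took. Here is a minimal sketch of that aggregation. The player names, the xG values, and the function name are all hypothetical, and this is not how the Tableau file itself is built.

```python
from collections import defaultdict


def individual_totals(shots):
    """Sum goals and xG per shooter from (shooter, xg, is_goal) tuples."""
    totals = defaultdict(lambda: {"goals": 0, "ixg": 0.0})
    for shooter, xg, is_goal in shots:
        totals[shooter]["goals"] += int(is_goal)
        totals[shooter]["ixg"] += xg
    return dict(totals)


# Made-up shots: (shooter, xG assigned by the model, did it go in?)
shots = [
    ("Player A", 0.25, True),
    ("Player A", 0.05, False),
    ("Player B", 0.10, False),
    ("Player B", 0.50, True),
    ("Player B", 0.15, True),
]

totals = individual_totals(shots)
for name, t in sorted(totals.items()):
    # Goals minus ixG: positive means the player scored more than expected.
    print(name, t["goals"], round(t["ixg"], 2), round(t["goals"] - t["ixg"], 2))
```

Comparing goals to ixG this way is what lets you filter for players who generate a lot of dangerous shots, or who finish better than their chances suggest.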
What are the next moves for @woumaxx?
Unfortunately, the love story between me and SC Bern (NL) is over and I am now a free agent. As I am still at university and not super qualified, I’m not looking for a full-time job in the hockey analytics field, but I am open to any opportunity to learn with a sports team next season. Paid or unpaid, men’s hockey or not, I’m open to anything. I want to learn and I want to help a team win. Even if it’s a sports media outlet, I’m willing to put my own blog on pause to explain what I know of hockey analytics concepts in plain terms, in English or French, on your blogs or your podcasts. I won’t cry if I don’t end up with an opportunity. I have a BIG list of projects and research I want to work on, so you will still see me analyticsing on Twitter. I share the W’s, so I also have to share the L’s: I applied to 2 or 3 sports analytics internships for summer 2022 and was selected for none. I still really enjoyed the interviews and the challenges (homework) I was given to prove my worth. You will see my name again ;) !
I can’t end the article without thanking my friends and all the people from the sports analytics field for supporting me and offering me help when I need it.
See yaaaaa!!!