The Leicester Problem: Expected Goals, Honestly Explained

Leicester City won the 2015-16 Premier League with an expected goals total that, at the time, suggested they were maybe the fourth or fifth best team in the league. The xG models implied something closer to 65 points. They finished with 81. A trophy went up. Pundits called it a fairy tale. The xG community quietly noted, in their corner of Twitter, that the model had been correct about the underlying performance and that Leicester would probably regress sharply the following season. The next year, Leicester finished 12th.

This is the trick with expected goals. The metric does not predict trophies. It describes the work that produces them, and the work that does not. When the work and the result disagree, the result usually catches up. The fairy tale was real. So was the regression.

If you have read about xG and walked away feeling like the people who use it are smug or that it cheapens what you watched, the rest of this article is going to try to win you back. Expected goals, used well, is the most honest metric in mainstream football coverage. Used badly, it is a way to tell people who watched the match that they did not understand it. Both versions are common.

What expected goals actually measures

Every shot taken in a professional match is, at the moment the ball leaves the foot, defined by a set of features. Distance to goal. Angle. Body part used. Pass leading to the shot (cross, through ball, cutback, individual carry). Defensive pressure. Whether the shot is from open play, a set piece, or a penalty. Some models add more — pre-shot goalkeeper position, footedness of the shooter, big chance flags.

An xG model is trained on tens of thousands of historical shots described by those features, and for any new shot it outputs a probability that the shot will be scored. A penalty is roughly 0.76 xG. A header from twelve yards under pressure is roughly 0.15. A weak left-footed effort from twenty-five yards is roughly 0.03. Add them up over a match and you get an xG total: the expected number of goals from the chances created, regardless of who was in goal or how clinical the finishing was that day.
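The adding-up step is simple enough to sketch. The example below uses the three rough shot values quoted above; the shot list and the function name match_xg are illustrative, not taken from any provider's data or API.

```python
# Illustrative shot values taken from the paragraph above; a real model
# would produce its own number for each shot.
shots = [
    {"description": "penalty", "xg": 0.76},
    {"description": "header, twelve yards, under pressure", "xg": 0.15},
    {"description": "weak left-footed effort, twenty-five yards", "xg": 0.03},
]


def match_xg(shots):
    """Sum per-shot scoring probabilities into a match xG total."""
    return sum(shot["xg"] for shot in shots)


total = match_xg(shots)
print(f"Match xG: {total:.2f}")
```

Three shots, 0.94 xG: about one goal's worth of chances, most of it from the penalty.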

The intuition is simple. If your team generates four 0.25 xG chances and the opponent generates one 0.30 xG chance, you “deserved” to win even if you lost 1-0 to a deflection. Over ten matches, the team consistently generating better xG will probably score more goals than the team consistently generating worse xG. The metric is, at heart, a way to separate process from outcome.
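To put a number on “deserved,” one common approach is to treat every shot as an independent coin flip weighted by its xG and work out the scoreline probabilities. The sketch below does that for the example above. The independence assumption is a simplification (rebounds and game state break it), and the function names are hypothetical.

```python
def goal_distribution(shot_xgs):
    """Return [P(0 goals), P(1 goal), ...], treating shots as independent."""
    dist = [1.0]
    for p in shot_xgs:
        new = [0.0] * (len(dist) + 1)
        for goals, prob in enumerate(dist):
            new[goals] += prob * (1 - p)      # this shot is missed
            new[goals + 1] += prob * p        # this shot is scored
        dist = new
    return dist


def outcome_probabilities(team_xgs, opponent_xgs):
    """Return (win, draw, loss) probabilities for the first team."""
    team = goal_distribution(team_xgs)
    opponent = goal_distribution(opponent_xgs)
    win = draw = loss = 0.0
    for t, pt in enumerate(team):
        for o, po in enumerate(opponent):
            if t > o:
                win += pt * po
            elif t == o:
                draw += pt * po
            else:
                loss += pt * po
    return win, draw, loss


# Four 0.25 xG chances against one 0.30 xG chance, as in the text.
win, draw, loss = outcome_probabilities([0.25] * 4, [0.30])
print(f"win {win:.1%}, draw {draw:.1%}, loss {loss:.1%}")
```

Under those assumptions the team with four chances wins about 56 percent of the time, draws about 35 percent, and loses a little under 10 percent. The loss is unlikely, not impossible, which is the whole point.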

The cleanest uses of xG

A few applications where the metric pays its rent.

Identifying overperformance and underperformance

Borussia Dortmund’s 2023-24 league campaign is a textbook example. Their xG profile suggested a top-three finish. They finished fifth, lost to Real Madrid in the Champions League final, and changed managers. On the public data, the shots they were generating and conceding were better than the table suggested. The next season they recruited differently. The underlying numbers were a leading indicator.

At the player level, the same logic applies. A striker scoring eight goals from 14.5 xG over a half-season is, on average, going to revert toward the underlying number. A striker scoring 18 goals from 9.0 xG is, on average, going to come back to earth. The correction runs in both directions, and the player who looks “in form” in November sometimes turns out to have been borrowing from the next six months.
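The comparison behind both examples is just goals minus xG. A minimal sketch, using the two strikers described above (the labels and the function name finishing_delta are made up for illustration):

```python
def finishing_delta(goals, xg):
    """Goals minus xG: positive means overperforming the chances taken."""
    return goals - xg


# The two half-season examples from the paragraph above.
strikers = {
    "Striker A": {"goals": 8, "xg": 14.5},
    "Striker B": {"goals": 18, "xg": 9.0},
}

deltas = {
    name: finishing_delta(line["goals"], line["xg"])
    for name, line in strikers.items()
}

for name, delta in deltas.items():
    label = "over" if delta > 0 else "under"
    print(f"{name}: {delta:+.1f} goals vs xG ({label}performing)")
```

Striker A is 6.5 goals short of his chances and Striker B is 9.0 goals ahead of his. Neither number says when the correction arrives, only which direction it is likely to run.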

Evaluating chance creators

Expected assists (xA) extend the same logic. A midfielder generating 0.40 xA per 90 is producing the kind of passes that, on average, lead to high-quality chances. Whether the strikers in front of him convert is a different question. FBref and StatsBomb both publish per-90 chance creation data that, over a full season, gives you a more reliable read on a creative player’s contribution than assists alone.
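Per-90 figures are totals scaled to a full match of playing time. The sketch below shows the arithmetic; the season totals are hypothetical numbers chosen to land on the 0.40 figure above, not any real player's data.

```python
def per_90(total, minutes):
    """Scale a season total to a per-90-minutes rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return total / minutes * 90


# Hypothetical midfielder: 8.0 xA accumulated over 1,800 minutes.
xa_total = 8.0
minutes_played = 1800

xa_per_90 = per_90(xa_total, minutes_played)
print(f"xA per 90: {xa_per_90:.2f}")
```

Scaling by minutes rather than appearances matters because a substitute with twenty minutes a week and a starter with ninety are otherwise not comparable.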

Scouting on small samples

Forty-five minutes of 2. Bundesliga football is not enough to evaluate a striker on goals. It is, occasionally, enough to evaluate him on xG per 90 if his shot profile is striking enough. Recruitment departments use this aggressively. The public version of the data lags but, season-on-season, has caught up enough that you can do a rough job at home.

Where xG breaks

The metric is not magic. It has known failure modes, and the writers who use xG well are usually the ones who name those failures out loud.

Big chance bias. Some models feed in a “big chance” tag (an Opta data flag for clear scoring opportunities), while others rely only on shot location and context. Two models can value the same chance at 0.25 and 0.45 depending on whether the shot was tagged as a clear opportunity. Always check which xG you are looking at. Public data from FBref, Understat, and other Opta-derived sources can disagree on the same team by several goals over a season.

Goalkeeper effects are often absent. Many public xG models do not include the position of the goalkeeper at the moment of release. An xGOT (expected goals on target) model covers part of that gap by valuing where an on-target shot was placed after it was struck. Both metrics together tell a fuller story than either alone. Reading xG without xGOT can sometimes credit a striker who placed the ball perfectly with the same value as a striker who hit the keeper in the chest.

Penalty distortion. A penalty is worth 0.76 xG, give or take. A team that draws three penalties in a match has an xG total inflated in a way that is not really repeatable. Most serious xG analysis presents npxG — non-penalty xG — for exactly this reason. If a writer is comparing strikers and not separating penalty goals out, the comparison is broken.
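Separating penalties out is a filter, not a model. A minimal sketch, assuming each shot record carries an is_penalty flag (a hypothetical field name; providers label this differently) and using made-up shot values around the 0.76 penalty figure above:

```python
# Hypothetical match: three penalties plus three open-play shots.
shots = [
    {"xg": 0.76, "is_penalty": True},
    {"xg": 0.76, "is_penalty": True},
    {"xg": 0.76, "is_penalty": True},
    {"xg": 0.30, "is_penalty": False},
    {"xg": 0.12, "is_penalty": False},
    {"xg": 0.08, "is_penalty": False},
]


def total_xg(shots):
    """xG from every shot, penalties included."""
    return sum(shot["xg"] for shot in shots)


def non_penalty_xg(shots):
    """npxG: xG from every shot that was not a penalty."""
    return sum(shot["xg"] for shot in shots if not shot["is_penalty"])


xg = total_xg(shots)
npxg = non_penalty_xg(shots)
print(f"xG {xg:.2f}, npxG {npxg:.2f}")
```

The headline total of 2.78 looks like a dominant performance. The 0.50 npxG underneath it looks like a team that did very little from open play.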

Game state matters. A team that goes up 2-0 in the 30th minute and parks the bus for the next hour will produce a lower xG total than they “should” for their quality. Reverse for the team chasing. Match xG is most useful when paired with the score state at the time the shots were taken.
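Pairing xG with score state means grouping each shot by whether the shooting team was leading, level, or trailing when it was taken. The sketch below assumes a goal_diff field on each shot (the shooting team's lead at that moment); the field name and the shot values are hypothetical.

```python
from collections import defaultdict


def xg_by_game_state(shots):
    """Total xG split by the shooting team's score state at each shot."""
    totals = defaultdict(float)
    for shot in shots:
        diff = shot["goal_diff"]
        if diff > 0:
            state = "leading"
        elif diff < 0:
            state = "trailing"
        else:
            state = "level"
        totals[state] += shot["xg"]
    return dict(totals)


# Hypothetical team that went 2-0 up by the 30th minute, then sat deep.
shots = [
    {"minute": 12, "goal_diff": 0, "xg": 0.35},
    {"minute": 27, "goal_diff": 1, "xg": 0.40},
    {"minute": 55, "goal_diff": 2, "xg": 0.05},
    {"minute": 78, "goal_diff": 2, "xg": 0.08},
]

split = xg_by_game_state(shots)
for state, value in split.items():
    print(f"{state}: {value:.2f} xG")
```

A split like this makes it visible when a modest match total is the product of an hour spent protecting a lead rather than an hour of being second best.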

Small samples lie loudly. A striker with 1.4 xG from four matches has not told you anything. The xG of a 90-minute match is, in isolation, mostly noise. The metric stabilizes over twenty-plus matches for a team and longer for individual finishers. Treat single-match xG the way you would treat a single coin flip: data, but not yet evidence.
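One source of that noise can be put in numbers: even if every xG value were exactly right, the goals actually scored would still scatter around the total. The sketch below treats shots as independent and uses a hypothetical shot profile (ten shots a match at 0.12 xG each) to compare one match against twenty. It does not capture model error or changes in team quality, which add further noise.

```python
import math


def goals_std_dev(shot_xgs):
    """Standard deviation of goals scored, treating shots as independent."""
    return math.sqrt(sum(p * (1 - p) for p in shot_xgs))


# Hypothetical shot profile: ten shots per match at 0.12 xG each.
one_match = [0.12] * 10
twenty_matches = one_match * 20

relative_noise = {}
for label, shots in [("1 match", one_match), ("20 matches", twenty_matches)]:
    xg = sum(shots)
    sd = goals_std_dev(shots)
    relative_noise[label] = sd / xg
    print(f"{label}: xG {xg:.1f}, std dev {sd:.2f}, relative noise {sd / xg:.2f}")
```

For a single match the spread is about 0.86 of the xG total, so the scoreline is nearly as much noise as signal. Over twenty matches it falls to about 0.19.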

The Leicester problem, and the limits of regression

Back to the 2015-16 Foxes. The model was correct that their underlying numbers did not support a title-winning season. The model was also wrong, in a deeper way, about how long the gap between underlying numbers and results could persist before reality reasserted itself. Eighty-one points is not a fluke. It is a club playing above its underlying level for nine consecutive months. The xG community sometimes treats that gap as a kind of debt that must be repaid, and the truth is that some teams hold it open longer than others and a few small clubs hold it open long enough to win a trophy.

This is the philosophical limit of expected metrics. They are descriptions of average outcomes given average finishing, average goalkeeping, and average luck. A team that, for structural reasons — a transcendent goalkeeper, a striker on the run of his life, a defense that consistently forces shots into the corners of the box even when xG calls them average — outperforms its underlying numbers can do so for a season or longer. The model is not wrong. The model is also not telling the whole story.

The honest version of xG analysis acknowledges both halves. Underlying numbers usually win in the long run. The long run, in football, can be longer than a season.

How to read xG in a match report

A short field guide for reading xG numbers in a piece of football writing without being either credulous or contrarian.

Check the source. Understat, FBref (which moved from StatsBomb’s data to Opta’s in 2022), and StatsBomb will sometimes show the same match with different xG totals. The models and the underlying event data differ.
Look for npxG. If penalties are included without comment, ask why.
Read the shot map, not just the total. Twenty shots from outside the box can produce an inflated-looking xG total that does not represent serious threat.
Anchor to the season, not the match. A single 90 has too much variance to argue from. Ten matches starts to mean something.
Watch the games. The metric is a translation of what you saw, not a replacement for seeing it.

What I tell people at parties

If a stranger asks what xG is, I usually say: it is the metric that explains why the team that played better sometimes loses, and the team that played worse sometimes wins, and tells you which one of those is going to keep happening. Most football conversations get easier when you have that idea in your back pocket. “The other team got lucky” becomes a defensible claim if you can point at a shot map. The fairy tale becomes a more interesting story when you can describe what made it improbable.

You do not need to be the person at the bar with the laptop. You do need to know which side of the metric the team you support has been living on. That, more than anything else, is what xG was built to tell you. For the basketball-equivalent conversation, our NBA advanced stats guide covers the same territory in a different sport, and our analytics primer sets the wider frame.

Leicester is still the lesson. The model was right about the work. The result took an extra season to agree.
