Sports Analytics, Honestly: The Working Primer

The first time I sat in front of a Synergy Sports feed, around 2014, I made a confident prediction about the Indiana Pacers and watched it disintegrate inside of two weeks. The pull-up jumpers I had typed up as “sustainable shot diet” started missing. The lineup I had circled as “matchup-proof” lost three home games to teams resting starters. The Bayesian humility that followed has never really left.

That is the strange thing about sports analytics. The numbers will, eventually, make you a better watcher. They will also, on the way there, make you look like an idiot. Anyone selling you a clean version of this story is either trying to sell you a course or has never bet a friend twenty dollars on a player prop and lost.

So here is what this guide actually is: not a glossary, not a brochure, not the talk your uncle gives at Thanksgiving about how analytics ruined the dunk contest. It is the working primer I would have wanted ten years ago, the one that respects that you have already watched a thousand games and can be trusted with a paragraph longer than a tweet.

What “analytics” usually means when smart people say it

The word does a lot of work. In a single NBA broadcast you can hear it used to mean shot selection, lineup data, win probability, player tracking, draft modeling, and, depressingly often, “I read a tweet from Zach Lowe once.” Most of the time, when a writer or analyst uses the word seriously, they mean one of three things.

The first is context-adjusted measurement. Instead of asking how many points a player scored, you ask how many he scored per possession his team used, against which defenses, with which teammates on the floor. Per-game stats are a function of opportunity. Per-possession stats start to look like a function of skill.

The second is expected outcomes. A shot is not just a make or a miss. It is, at the moment of release, a probability. A 24-foot pull-up from the top of the key with a defender within four feet is worth, on average, somewhere around 0.85 points. A wide-open corner three is worth roughly 1.20. The score at the buzzer is the verdict. Expected outcomes are the work being graded.
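The arithmetic behind those two figures is just make probability times shot value. Here is a minimal sketch of it. The make probabilities are assumptions back-solved from the point values quoted above (0.85 / 3 and 1.20 / 3), not measured league rates, and `expected_points` is a hypothetical helper name.

```python
def expected_points(make_probability: float, shot_value: int) -> float:
    """Expected points of one shot: chance it goes in times what it is worth."""
    return make_probability * shot_value


# Assumed make rates, chosen only so the outputs land near the article's figures.
contested_pull_up_three = expected_points(make_probability=0.283, shot_value=3)
open_corner_three = expected_points(make_probability=0.40, shot_value=3)

print(f"contested pull-up three: {contested_pull_up_three:.2f} points")
print(f"open corner three:       {open_corner_three:.2f} points")
```

The gap of roughly a third of a point per shot looks small until you multiply it by the dozens of shots a team takes in a night.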

The third is process versus result. A team that creates 1.18 expected points per possession and loses by twelve has done its job. A team that hits five fluky threes against a switched coverage and wins by one has not solved anything. Analytics, in its most useful form, is a tool for asking which one of those games is going to repeat.

The metrics actually worth your time

Most stat glossaries online will hand you forty acronyms and expect you to remember which one used to be sponsored by which website. You do not need that. You need four families.

Efficiency stats

True shooting percentage. Effective field goal percentage. Goals-per-shot in soccer. These tell you how many points (or goals) a player or team is generating relative to attempts, weighted properly so a three is worth more than a long two. If you only learn one analytics concept, this is the one. A player averaging 22 a game on 56% true shooting and a player averaging 22 a game on 49% true shooting are different players. The box score will not save you here.
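For the curious, both basketball formulas fit in a few lines. This sketch uses the commonly published definitions, including the conventional 0.44 weight on free throw attempts. The two stat lines are invented for illustration and do not belong to any real player.

```python
def true_shooting_pct(points: float, fga: float, fta: float) -> float:
    """TS% = points / (2 * (FGA + 0.44 * FTA)). Folds free throws into efficiency."""
    return points / (2 * (fga + 0.44 * fta))


def effective_fg_pct(fgm: float, fg3m: float, fga: float) -> float:
    """eFG% = (FGM + 0.5 * 3PM) / FGA. Credits a made three as 1.5 made twos."""
    return (fgm + 0.5 * fg3m) / fga


# Two hypothetical 22-point nights built from very different shot volumes.
ts_efficient = true_shooting_pct(points=22, fga=16, fta=8)
ts_inefficient = true_shooting_pct(points=22, fga=20, fta=6)

print(f"efficient scorer:   {ts_efficient:.1%} true shooting")
print(f"inefficient scorer: {ts_inefficient:.1%} true shooting")
print(f"8 makes, 2 of them threes, on 16 attempts: {effective_fg_pct(8, 2, 16):.1%} eFG")
```

Same 22 points in the box score, but one player needed about four more shots to get there.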

Possession and pace

Every team plays a different number of possessions per game. A 110-point game in Sacramento and a 110-point game in Chicago are not the same. Rate stats — points per 100 possessions, per-90 in soccer, per-60 in hockey — flatten that out. If a writer is comparing two teams and never mentions pace, raise an eyebrow.
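A rough sketch of the pace adjustment follows. The possession formula is the widely used simplified box-score estimate, which is an assumption here, since public sites use more elaborate versions. The possession counts for the two 110-point games are invented to show the effect.

```python
def estimate_possessions(fga: float, orb: float, tov: float, fta: float) -> float:
    """Simplified box-score estimate: shots and turnovers end possessions,
    offensive rebounds extend them, and 0.44 approximates free throw trips."""
    return fga - orb + tov + 0.44 * fta


def per_100_possessions(points: float, possessions: float) -> float:
    """Scale scoring to a common 100-possession baseline."""
    return 100 * points / possessions


# Hypothetical games: same final score, different tempo.
fast_game = per_100_possessions(points=110, possessions=104)
slow_game = per_100_possessions(points=110, possessions=96)

print(f"110 points on 104 possessions: {fast_game:.1f} per 100")
print(f"110 points on 96 possessions:  {slow_game:.1f} per 100")
print(f"estimated possessions: {estimate_possessions(fga=88, orb=10, tov=14, fta=25):.1f}")
```

Under these made-up numbers the slower team is nearly nine points per 100 possessions better on offense, which the raw score hides completely.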

On/off and lineup data

The plus/minus column in a single-game box score is mostly noise. Lineup data, collected over enough minutes, is closer to signal. When you see a player’s team outscoring opponents by 8 points per 100 possessions when he plays and getting outscored by 4 when he sits, you have something worth investigating. The trap is that lineup data lies in small samples. Two hundred minutes is a hint. Two thousand is an argument.
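The on/off comparison is a subtraction of two net ratings. This sketch reproduces the plus 8 and minus 4 from the paragraph above with invented point and possession totals, and `net_rating` is a hypothetical helper, not any provider's official calculation.

```python
def net_rating(points_for: float, points_against: float, possessions: float) -> float:
    """Point differential per 100 possessions."""
    return 100 * (points_for - points_against) / possessions


# Hypothetical season totals, split by whether the player was on the floor.
on_court = net_rating(points_for=2160, points_against=2000, possessions=2000)
off_court = net_rating(points_for=1152, points_against=1200, possessions=1200)
on_off_swing = on_court - off_court

print(f"on court:  {on_court:+.1f} per 100")
print(f"off court: {off_court:+.1f} per 100")
print(f"swing:     {on_off_swing:+.1f} per 100")
```

A 12-point swing is the thing worth investigating. It is not yet an explanation, because the number cannot say whether the player or his usual teammates caused it.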

Expected metrics

Expected goals (xG) in soccer. Expected effective field goal percentage in basketball. Expected points added (EPA) in football. These models translate the act of taking a shot, attempting a pass, or playing a possession into a probability of a good outcome. They strip away the volatility of the actual result. Used carelessly, they tell you a team that lost was unlucky. Used carefully, they tell you whether that team is generating the kind of looks that will win the next ten games.
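To see why a single result says so little, here is a small sketch of how a match xG total is assembled and how often it can still produce nothing. The shot probabilities are invented, and treating shots as independent is a simplifying assumption that real matches do not fully honor.

```python
import math

# Hypothetical per-shot scoring probabilities from one team's match.
shot_probabilities = [0.05, 0.10, 0.35, 0.02, 0.60, 0.08]

# Match xG is the sum of the individual shot probabilities.
expected_goals = sum(shot_probabilities)

# Assuming independent shots, the chance of scoring zero is the product of the misses.
probability_of_zero_goals = math.prod(1 - p for p in shot_probabilities)

print(f"xG: {expected_goals:.2f}")
print(f"chance of being shut out anyway: {probability_of_zero_goals:.0%}")
```

With these numbers a team creating 1.20 xG still gets shut out about one match in five, which is why one blank scoreline is weak evidence and ten matches of shot quality is stronger.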

Where this gets weird

None of this is clean. Analytics built its early reputation on a few clean stories — Moneyball, the 2014 Spurs, the rise of the three-point shot — and then spent a decade running headfirst into the messy ones. Here is the honest list of places where the numbers will betray you if you trust them too quickly.

Small samples lie loudly. The first ten games of an NBA season can put a player on pace for an all-time efficient year, ranked alongside prime Curry, and that same player can be on the bench by April. Ten games is not a season. Five matches is not a Premier League campaign. Public stat databases will happily present that early data as if it were settled. It is not.

Role changes wreck comparisons. A guard who shot 38% from three as a spot-up specialist will probably not shoot 38% as the primary creator. A center who looked elite as a screener can look ordinary as a roll-and-pop hub. The numbers are real. The role they were earned in is also real. Both have to travel together.

Coaching schemes distort signal. Drop coverage versus switch coverage. High line versus low block. Two-back personnel versus empty. Stats describe what happened inside a scheme. They do not always survive the scheme changing.

Injuries hide in the data. Some of the cleanest analytical stories of the last decade are, on rewatch, injury stories. A player’s efficiency cratered. The model said he was done. He was, in fact, playing on a torn ligament. Injury reports are part of stat reading, not adjacent to it.

How to read analytics without becoming insufferable

There is a posture problem in this corner of sports media, and I am not going to pretend I have not contributed to it. The temptation, when you learn a few of these concepts, is to use them to win arguments instead of understand games. The better posture is closer to what you would expect from a beat writer: curious, specific, willing to be wrong out loud.

A short checklist I run before I write a sentence with a number in it.

Is the sample big enough that this number means something? If a player has 14 attempts, the percentage attached to them is barely a number.
Is the comparison fair? Per-game versus per-game is rarely fair. Per-possession or per-90 is.
What is the role context? Is this number being produced inside a usage rate, lineup, or scheme that will survive the next month?
Does the eye test agree, and if not, which one is wrong?
Am I using this stat to teach the reader something, or to win the argument I am already having in my head?
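The first question on that list can be made concrete with a confidence interval. This is a sketch using the Wilson score interval, which is one reasonable choice among several and not a method the checklist prescribes. The 7-of-14 and 700-of-1400 lines are invented.

```python
import math


def wilson_interval(makes: int, attempts: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a shooting percentage."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    p = makes / attempts
    denominator = 1 + z**2 / attempts
    center = (p + z**2 / (2 * attempts)) / denominator
    half_width = (
        z * math.sqrt(p * (1 - p) / attempts + z**2 / (4 * attempts**2)) / denominator
    )
    return center - half_width, center + half_width


small_low, small_high = wilson_interval(makes=7, attempts=14)
large_low, large_high = wilson_interval(makes=700, attempts=1400)

print(f"7 of 14:     50% shooter, plausibly {small_low:.0%} to {small_high:.0%}")
print(f"700 of 1400: 50% shooter, plausibly {large_low:.0%} to {large_high:.0%}")
```

On 14 attempts the same 50% is compatible with a bad shooter and a historically great one. On 1,400 attempts the range shrinks to a few points on either side.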

The last question is the one I think most about. Analytics is a translation tool. Used well, it converts a game you watched into a game you understood. Used badly, it converts a game you watched into a tweet.

The reader’s lie detector

The quickest way to know whether an analytics piece is worth your time is to ask what would change the writer’s mind. If the answer is nothing, you are not reading analysis. You are reading merchandising with a spreadsheet costume.

A useful stat argument should have escape hatches built into it. Maybe the sample is only 240 minutes. Maybe the opponent quality is weird. Maybe the player just changed roles. Maybe the model likes the shot profile but hates the person taking the shots. These are not annoying caveats to be swept under the rug. They are the rug.

That is why the best analytics writing feels less like a verdict and more like cross-examination. The stat says one thing. The tape objects. The schedule enters evidence. The injury report coughs loudly from the back row. Somewhere in that mess is the part worth writing down.

A short, opinionated reading list

You do not need a course. You need to read writers who do this carefully. Start with Zach Lowe for NBA tactical writing that earns its data. Read FBref’s xG explainer if you want to understand expected goals from the public-data side. Spend an afternoon with the NBA Stats glossary, not to memorize it but to see how the league defines its own terms.

For sport-specific deep dives, our guide to NBA advanced stats and our piece on expected goals in soccer are the next two stops. Both assume you have read this one.

One last thing

Ten years in, the number I think about most is not on any leaderboard. It is the rate at which I was confidently wrong, divided by the rate at which I was confidently right, in the period before I learned to ask the questions above. Analytics did not make me a sharper fan because it gave me a glossary. It made me a sharper fan because it taught me how to lose an argument with myself. That is the whole pitch. Everything else is decoration.

The four metric families, in one table

If you wanted the entire working primer compressed into a single reference card, this is it. Each row is a family the article spent paragraphs on. The point is to keep the families separate even when the conversations blur them.

Family	What it answers	Sport-specific example	Where it breaks
Efficiency	How many points per attempt	True shooting (NBA), goals per shot (soccer)	Ignores role and shot difficulty
Possession and pace	How much a team produces per possession, independent of tempo	Points per 100 possessions (NBA), shots per 90 (soccer)	Conflates style with quality
On/off and lineup	Who actually moves the team	Five-man net rating, on-court xG	Lies in small samples
Expected outcomes	Process vs result over time	xG (soccer), expected eFG% (NBA), EPA (NFL)	Cannot model goalkeeper or defender skill fully

If a piece of sports writing uses one of these without naming which family it belongs to, the writer is asking the metric to carry weight it cannot hold. The categories are not bureaucratic. They are what keeps a stat from being recruited into the wrong argument.

Frequently asked questions

Do I need a stats degree to read this stuff well?

No. The math involved in true shooting percentage, points per 100, and xG is high-school arithmetic. The hard part is not the math. The hard part is the discipline to ask “compared to what?” before assigning meaning to the number. That is a reading skill more than a quantitative one.

What is the difference between “advanced stats” and “analytics”?

Practically, they describe the same project: extracting better signal from sports performance data than the box score gives you. “Advanced stats” tends to mean specific metrics (TS%, EPA, xG). “Analytics” tends to mean the broader practice of using those metrics inside a workflow — model, test, update, repeat. The vocabularies overlap. The motives are identical.

Which sports have the most mature analytics scenes?

Baseball is the oldest and most settled. NBA analytics is the most public and most aggressively debated. Soccer analytics matured later but caught up quickly. NFL and college football have strong proprietary work but weaker public-facing tools. WNBA analytics is in a growth phase. Each sport’s tooling has its own quirks, but the underlying questions translate.

Can analytics predict who will win a game?

Better than chance, not nearly perfectly. Public models like FiveThirtyEight’s old NBA forecasts hit roughly 65 to 70% accuracy on individual game picks, which sounds modest until you realize how much variance lives in a single game. Predictions about full seasons or playoff series are far more accurate than predictions about Tuesday nights. The metric for trustworthy analysis is calibration, not certainty: when a model says 70%, the thing should happen about 70% of the time.
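Calibration can be checked with very little code. The sketch below scores two hypothetical forecasters on the same ten invented games using the Brier score, a standard measure where lower is better. It is an illustration of the idea, not a reconstruction of how any public model was graded.

```python
def brier_score(forecasts: list[float], outcomes: list[int]) -> float:
    """Mean squared gap between forecast probability and what happened (1 or 0)."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


def observed_win_rate(outcomes: list[int]) -> float:
    """Share of games the picked team actually won."""
    return sum(outcomes) / len(outcomes)


# Ten hypothetical games in which the favorite won seven times.
outcomes = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]

calibrated = brier_score([0.7] * 10, outcomes)     # says 70% every time
overconfident = brier_score([1.0] * 10, outcomes)  # says 100% every time

print(f"favorites actually won: {observed_win_rate(outcomes):.0%}")
print(f"calibrated forecaster Brier score:    {calibrated:.2f}")
print(f"overconfident forecaster Brier score: {overconfident:.2f}")
```

Both forecasters picked the same winners and were right the same seven times. The one who admitted uncertainty scores better, which is the whole argument for calibration over certainty.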

The takeaway, in one paragraph

Sports analytics is not a vocabulary test. It is a habit of reading. Pick a metric family. Ask what question it answers. Check whether the writer is using it inside its family or smuggling it into someone else’s argument. For the sport-specific deep dives that pick up where this primer ends, our NBA advanced stats field guide and our expected goals explainer are the two natural next stops.
