January 26, 2025. AFC Championship in Kansas City. The Bills produce one of the cleanest offensive performances of their season by one metric and one of their more frustrating by another. Their success rate sits in the low fifties for the night, well above league average and the kind of efficiency that usually wins on the road. Their EPA per play is essentially flat.
Buffalo moves the chains all night, stalls inside the twenty twice, settles for two field goals, and loses by three. The two stats, both built off the same play-by-play data, tell two different stories about the same offense.
This kind of split is not a glitch. It is the most useful thing two well-built metrics can do — disagree in a way that points you toward the part of the game the box score will not surface on its own. Success rate is built to reward consistency. EPA is built to reward the points hidden inside efficient possessions. When they agree, you have a clean read on an offense. When they split, the split is the article.
The piece below is the working version of the difference: what each stat actually measures, where they overlap, where they break apart, and the short workflow we use at SportsHighLight before quoting either of them in a write-up.
Quick read: success rate and EPA in 60 seconds
- Success rate measures whether a play “succeeded” against a yardage threshold tied to down and distance. It rewards consistency. Average is roughly 45% league-wide.
- EPA (Expected Points Added) measures how much a single play changed the team’s expected points, comparing the situation before the snap with the situation after it. It rewards big plays and punishes drive-killing ones disproportionately.
- They agree when an offense is both consistent and explosive (or consistently bad and turnover-prone). That covers most regular-season reads.
- They split when an offense moves the chains methodically without scoring (high SR, low EPA) or when a few big plays carry a quieter possession profile (low SR, high EPA).
- Which to trust: use both. SR explains how often the offense stays on schedule. EPA explains how much it actually produced.
Success rate, in plain language
Success rate looks at each play and asks one question: did this play gain enough yards to keep the drive on track? The conventional cutoffs, popularized by Football Outsiders and now common across public analytics, are roughly 40% of yards needed on first down, 60% on second down, and 100% on third or fourth. (nflfastR’s built-in success flag uses a different definition: it marks any play with positive EPA.) A first-and-ten that gains four yards is a “successful” play. A second-and-six that gains three yards is not. Add them all up across a game or season and the percentage is your success rate.
The appeal is that the metric strips out the volatility that makes single plays misleading. A 60-yard touchdown looks great. So does a five-yard run on first down. Success rate counts each as exactly one successful play, so the long score cannot paper over a string of stalled possessions.
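The cutoff rule above is small enough to write down directly. The sketch below is a minimal version under the 40/60/100 thresholds quoted in this section; the function names and the three-play sample are ours for illustration, not taken from any library.

```python
from typing import Iterable, Tuple

# Percentage of the yards to go that a play must gain to count as a success,
# keyed by down. These are the 40/60/100 cutoffs described in the text.
THRESHOLD_PCT = {1: 40, 2: 60, 3: 100, 4: 100}


def is_successful(down: int, yards_to_go: int, yards_gained: int) -> bool:
    """Return True if the play gained enough to keep the drive on schedule."""
    # Integer arithmetic avoids floating-point surprises at the cutoff itself.
    return yards_gained * 100 >= THRESHOLD_PCT[down] * yards_to_go


def success_rate(plays: Iterable[Tuple[int, int, int]]) -> float:
    """Share of successful plays; each play is (down, yards_to_go, yards_gained)."""
    results = [is_successful(*play) for play in plays]
    return sum(results) / len(results) if results else 0.0


# Illustrative sample: the two plays from the text plus a third-down conversion.
sample_plays = [(1, 10, 4), (2, 6, 3), (3, 3, 5)]
print(f"Success rate: {success_rate(sample_plays):.0%}")
```

The first-and-ten gaining four yards passes, the second-and-six gaining three does not, and the third-down conversion passes, so two of the three sample plays count as successes.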
Success rate stabilizes after roughly 150-200 offensive plays, which is about three full NFL games for an offense. By the time a team has played four or five games, its success rate is reasonably informative. Pro Football Reference and Sumer Sports both publish team and player success rate splits across down, distance, and personnel groupings.
EPA, in plain language
EPA stands for Expected Points Added. Every situation on a football field has, based on historical data, an “expected points” value — what the average team scores on this possession given down, distance, and field position. A first-and-ten on your own twenty is worth roughly +0.4 expected points. A third-and-eight from your own forty is closer to +0.1. A first-and-goal at the one is around +5.5. Each play either improves the offense’s expected points or decreases them. The difference is the play’s EPA.
The mechanic matters because it captures things success rate cannot. A successful third-and-eight conversion adds far more EPA than a successful first-and-ten plunge. A sack on second-and-ten subtracts more EPA than the same yardage loss on first-and-ten. EPA is, in effect, success rate with leverage applied.
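A rough sketch of the arithmetic helps here. The two "before" values below are the approximate figures quoted above; the two "after" values are hypothetical numbers we picked purely to illustrate the leverage effect, not output from a real expected-points model.

```python
def expected_points_added(ep_before: float, ep_after: float) -> float:
    """EPA for a play where the offense keeps the ball: EP after minus EP before."""
    return ep_after - ep_before


# "Before" values: the rough figures quoted in the text.
EP_FIRST_AND_TEN_OWN_20 = 0.4
EP_THIRD_AND_EIGHT_OWN_40 = 0.1

# "After" values: ASSUMED for illustration only.
EP_SECOND_AND_SIX_OWN_24 = 0.5      # after a four-yard first-down plunge
EP_FIRST_AND_TEN_NEAR_MIDFIELD = 2.0  # after converting the third-and-eight

plunge_epa = expected_points_added(EP_FIRST_AND_TEN_OWN_20, EP_SECOND_AND_SIX_OWN_24)
conversion_epa = expected_points_added(
    EP_THIRD_AND_EIGHT_OWN_40, EP_FIRST_AND_TEN_NEAR_MIDFIELD
)

# Both plays count as one "success" under success rate, but their EPA differs.
print(f"First-down plunge EPA:     {plunge_epa:+.2f}")
print(f"Third-and-eight conversion EPA: {conversion_epa:+.2f}")
```

Under these assumed values the conversion is worth many times the plunge, even though success rate scores the two plays identically.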
EPA stabilizes a bit faster than success rate at the team level, at roughly 100-150 plays, because each play carries more information per observation. Public play-by-play data via nflfastR lets anyone replicate EPA at home. Ben Baldwin’s rbsdm.com is the most widely used public dashboard.
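The stabilization ranges for the two stats can be turned into a quick sample-size check before quoting either number. This is a minimal sketch: it uses the lower end of each range, and it assumes a round 60 offensive plays per game, which is our simplification rather than a measured figure.

```python
from typing import Dict

# Lower ends of the team-level stabilization ranges quoted in the text.
STABILIZATION_PLAYS = {"success_rate": 150, "epa_per_play": 100}

# ASSUMPTION: a round number for offensive plays per game.
PLAYS_PER_GAME = 60


def stabilized_metrics(games_played: int, plays_per_game: int = PLAYS_PER_GAME) -> Dict[str, bool]:
    """Report which metrics have cleared their minimum stabilization sample."""
    plays = games_played * plays_per_game
    return {metric: plays >= needed for metric, needed in STABILIZATION_PLAYS.items()}


for games in (1, 2, 4):
    print(games, "game(s):", stabilized_metrics(games))
```

Swapping in the upper ends of the ranges (200 and 150 plays) gives a more conservative check.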
If EPA is new to you, the full architecture lives in our EPA explainer. The rest of this piece assumes you have at least the headline version.
The two stats compared, side by side
The table below is the version of the comparison we keep open when writing a Sunday recap. It is not exhaustive. It is the shortlist of differences that actually show up in coverage.
| Dimension | Success rate | EPA per play |
|---|---|---|
| What it rewards | Consistency: staying ahead of the chains | Leverage: producing points relative to situation |
| Best at measuring | How often an offense stays on schedule | How much an offense actually produces per play |
| Sensitivity to big plays | Low: a 60-yard run counts as one successful play | High: a 60-yard run can swing EPA by 4+ points |
| Sensitivity to red-zone outcomes | Moderate: a stalled red-zone drive still has successful plays | High: failing to score from inside the 20 carries large negative EPA |
| Stabilization (team level) | ~150-200 plays | ~100-150 plays |
| Where to find it | Football Outsiders, Pro Football Reference, Sumer Sports | nflfastR, rbsdm.com, ESPN, PFF |
| Common misread | Treating a high SR as evidence the offense is “good” | Treating a single big-play game as a sustainable trend |
None of these differences require math beyond the table. They do require remembering that a metric’s strengths are also its blind spots. Success rate misses the big play. EPA over-rewards it. Reading them together cancels out most of the noise.
Three games where the two stats disagreed
The cleanest way to teach the difference is to look at games where the split was real and the explanation was simple.
Buffalo at Kansas City, AFC Championship 2025. Bills offensive success rate approximately 52%; EPA per play approximately -0.01. The disagreement was almost entirely a red-zone story. Buffalo moved the chains efficiently between the twenties, then failed to score touchdowns on two trips inside the ten. Success rate kept counting first-down conversions. EPA priced in the cost of settling for field goals against an opponent that was turning similar field position into points at the other end.
Detroit at Chicago, late 2024 regular season. Lions offensive success rate approximately 46% (basically league average); EPA per play approximately +0.21 (top-tier). The split was the inverse of the Bills game. Detroit had several quiet possessions interrupted by explosive plays — two long touchdowns and a 40-yard completion that set up a third score. Success rate underweighted those moments because each was one play. EPA weighted them correctly. The Lions’ offensive efficiency that night was real even though the chain-moving cadence was ordinary.
Houston, October 2024 stretch. Texans success rate hovered above 48% across a four-game stretch; EPA per play sat below zero. The cause was a known one in NFL analytics: third-and-long inefficiency. Houston was winning first and second downs often enough to keep success rate respectable, but converting third-and-eight at a rate well below league average. That dragged EPA into negative territory, because failed third-down conversions carry large negative EPA values. The team eventually changed offensive coordinators. The split had warned about the problem weeks earlier.
A decision framework: which stat to trust when
This is the short version of the workflow we run before quoting either number. The table below maps common analytical questions to which stat answers them more honestly.
| Question you are asking | Use this stat first | Why |
|---|---|---|
| Is this offense staying on schedule? | Success rate | Built explicitly for chain-moving consistency |
| How much did this offense actually produce? | EPA per play | Reflects points adjusted for game context, not just drive continuity |
| How good is this team in the red zone? | EPA (with explicit red-zone filter) | Captures the cost of stalled drives that SR underweights |
| Should I trust a four-game hot stretch? | Both, then check stabilization | Four games is roughly 240 plays; both stats start being meaningful |
| Is this QB efficient or just productive? | EPA per dropback | Yards per attempt confuses volume with quality; EPA does not |
| Is this offense built to score or to grind? | SR and EPA together | High SR + low EPA = grind. Low SR + high EPA = boom-or-bust |
| Did this single game tell me anything? | Neither alone | Single-game samples have too much variance for either stat to anchor an argument |
The pattern is that the right answer is almost never “use one.” The right answer is “use both and read the gap.” That gap is where the article lives.
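One way to make "read the gap" concrete is a four-way label built from both numbers. The sketch below is a hypothetical helper, not part of any published workflow, and its two cutoffs (48% success rate, +0.05 EPA per play) are assumptions chosen for illustration rather than established benchmarks.

```python
def classify_offense(
    success_rate: float,
    epa_per_play: float,
    sr_cutoff: float = 0.48,   # ASSUMED cutoff for a "high" success rate
    epa_cutoff: float = 0.05,  # ASSUMED cutoff for a "high" EPA per play
) -> str:
    """Label an offense by where it sits on both stats at once."""
    high_sr = success_rate >= sr_cutoff
    high_epa = epa_per_play >= epa_cutoff
    if high_sr and high_epa:
        return "consistent and explosive"
    if high_sr:
        return "grind"  # moves the chains without producing points
    if high_epa:
        return "boom-or-bust"  # big plays carry a quiet possession profile
    return "struggling"


# Approximate figures from the Bills and Lions games discussed in this piece.
print("Bills at Chiefs:", classify_offense(0.52, -0.01))
print("Lions at Bears: ", classify_offense(0.46, 0.21))
```

The two games are used here only because their numbers appear in this piece. As the table notes, a single game is too small a sample to anchor an argument, so a label like this is more useful on a multi-game stretch.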
Why both stats have earned their place
Plenty of NFL metrics get popular and then quietly disappear when the next better tool arrives. Quarterback rating did exactly that. Success rate and EPA have both survived more than a decade of public scrutiny precisely because they answer different questions cleanly. Neither tries to be the only number on the page. Each plays its own role, and used together inside one workflow they beat either alone.
The frame we use to decide whether a metric like these is worth quoting at all lives in our useful metric piece: stability, falsifiability, and disagreement with the obvious read of the game. Success rate and EPA pass all three. Most public NFL stats do not. The DVOA family from Football Outsiders is the third member of this small club — our DVOA explainer covers how the opponent-adjusted version of these ideas extends the framework further.
Frequently asked questions
If I can only track one stat, which should it be?
EPA per play. It carries more information per observation, stabilizes faster, and handles red-zone and turnover situations more honestly. Success rate is a useful complement, but if forced to pick one, EPA wins. For QB-specific evaluation, EPA per dropback is the modern equivalent of the old quarterback rating, and it is the one most public analysts (including ESPN’s QBR derivatives) lean on.
Why do success rate cutoffs use 40% of yards on first down?
The cutoffs were chosen historically to mark the difference between “on schedule” and “behind the chains” based on average NFL drive outcomes. A first-down play gaining 40% of yards-to-go (typically 4 of 10) puts the offense in a manageable second-and-six. Anything less leaves them in second-and-seven-plus, where conversion rates fall sharply. The math is empirical rather than theoretical. Football Outsiders and Sumer Sports each publish slightly different conversion baselines.
Does EPA handle turnovers correctly?
Yes, and aggressively. A turnover typically swings expected points by 4-6 points, because the opponent gets the ball and the offense loses the points it was likely to score. EPA punishes turnovers heavily, which is one reason teams with low turnover rates often look better in EPA than in standard offensive yardage rankings. This is also why a single pick-six can drop a quarterback’s EPA per play by a noticeable amount across a small sample.
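The two pieces of that swing can be written out as a short calculation. In this sketch the opponent's value is the rough first-and-ten-at-their-own-twenty figure used earlier in this piece (+0.4), while the offense's pre-turnover value of 4.0 is a hypothetical red-zone number we chose for illustration.

```python
def turnover_epa(ep_before: float, opponent_ep_after: float) -> float:
    """EPA of a turnover, from the perspective of the offense that lost the ball."""
    # The offense gives up what it expected to score, and the opponent's new
    # expected points count against it, so both terms are subtracted.
    return -opponent_ep_after - ep_before


EP_BEFORE_TURNOVER = 4.0   # ASSUMED: offense deep in the red zone
OPPONENT_EP_AFTER = 0.4    # opponent takes over at its own twenty

swing = turnover_epa(EP_BEFORE_TURNOVER, OPPONENT_EP_AFTER)
print(f"Turnover EPA: {swing:+.1f}")
```

With these inputs the swing lands inside the 4-6 point range, and most of it comes from the points the offense gave up rather than the points the opponent gained.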
What is the relationship between EPA and DVOA?
DVOA (Defense-adjusted Value Over Average), Football Outsiders’ opponent-adjusted efficiency metric, is essentially EPA with two layers added: opponent strength adjustment and game-state weighting. The two metrics correlate strongly across a season — roughly 0.75-0.80 at the team level — but DVOA tends to be more useful for season-end comparisons across teams that played different schedules. Our DVOA piece covers the methodology in full.
The takeaway, in one paragraph
Success rate tells you how often. EPA tells you how much. Used together, they describe an offense more honestly than either does alone, and the gap between them is almost always where the story lives. The Bills lost a championship game where success rate said they were fine and EPA said they were not. Both numbers were correct. The article was the disagreement. For the related conversation about how the 4th-down decisions that show up inside these metrics have rewired NFL coaching, our 4th-down revolution piece picks up exactly where this one stops.



