Sprint velocity — the number of story points completed per sprint — is a widely used but fundamentally flawed engineering productivity metric because it creates perverse incentives: teams inflate story points to hit targets, producing the appearance of productivity while actual feature output declines. Better alternatives include deployment frequency (how often code ships to production), change lead time (time from commit to production), and change failure rate (percentage of deployments causing incidents) — the DORA metrics framework. These metrics measure outcomes engineers actually care about (shipping, speed, quality) rather than arbitrary point estimates that reward gaming over genuine productivity.
At UshaOm, I used sprint velocity as a planning tool for two years. Our velocity was steadily increasing — until I realized engineers were inflating story points to make the numbers look better.
By Priya Shankar
Your team runs two-week sprints. At the end of the sprint, the engineering manager stands up and announces: "We committed to 40 story points and we completed 40 story points. Velocity is stable."
Everyone nods. This is seen as healthy. The team is predictable. You can plan based on this velocity.
But here's the thing: the team completed fewer features than they did last quarter. Features that should have shipped didn't. But velocity says everything is fine.
That's because velocity measures a social construct, not reality. Story points are estimates made by humans about work that is uncertain. When velocity becomes your metric, teams learn to inflate estimates. Not maliciously. Naturally. Because hitting the target is now what they're graded on.
This is a version of Goodhart's Law: when a metric becomes a target, it stops being a useful metric.
Sprint velocity is lying to you. Here's what's actually happening.
How Velocity Inflates
The mechanism is simple and almost invisible. A team ships 10 features in a sprint and calls it 40 points. The next sprint, they ship 10 similar features but the engineering manager says "the first sprint we underestimated." So they call it 50 points now. Velocity went up. But the number of shipped features stayed the same.
This isn't intentional gaming; it's calibration. Every team naturally recalibrates its estimates toward hitting its target. Humans are really good at this. We're not even aware we're doing it.
Over time, what happens is that a "point" means something different every quarter. A 5-point story in Q3 is not the same amount of work as a 5-point story in Q1. But velocity says it is. So your planning based on velocity is planning based on a moving target.
I watched a team whose velocity climbed from 40 points per sprint to 60 points per sprint over a year. The engineering manager was proud. "We're getting faster!" But when I asked how many features shipped per sprint, the number stayed roughly constant. They weren't getting faster. They were just estimating the same work as higher points.
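The drift is easy to spot if you track points per shipped feature alongside velocity. A minimal sketch with made-up sprint numbers (the data is hypothetical, loosely modeled on the pattern above):

```python
# Hypothetical sprint data illustrating point inflation: velocity rises
# while the number of shipped features stays roughly flat.
sprints = [
    {"points": 40, "features": 10},
    {"points": 45, "features": 10},
    {"points": 52, "features": 11},
    {"points": 60, "features": 10},
]

for i, s in enumerate(sprints, start=1):
    # Points per feature is the tell: if it climbs while features
    # stay flat, the team is inflating estimates, not shipping more.
    ratio = s["points"] / s["features"]
    print(f"Sprint {i}: velocity={s['points']:>2}  "
          f"features={s['features']:>2}  points/feature={ratio:.1f}")
```

Here points per feature climbs from 4.0 to 6.0 while feature output is unchanged, which is exactly the "we're getting faster!" illusion.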
Velocity Doesn't Capture Debt
The bigger problem is that velocity doesn't account for technical debt. Or rather, it accounts for it really badly.
A team can ship high velocity for several quarters by taking shortcuts. Not committing to tests. Not refactoring as they go. Not thinking about coupling. The velocity is great. The codebase is getting messier.
Then one quarter, they hit a wall. The debt comes due. They spend 60% of the sprint paying down technical debt. Velocity crashes. Everyone panics. "What happened? Did we lose people?"
No. You didn't lose people. You accumulated debt for four quarters and now you're paying the interest.
Velocity makes debt invisible until it becomes expensive. By then it's too late to plan for it. You're in crisis mode, not planning mode.
The teams I've seen that maintain consistent velocity over years are the ones that take debt seriously every quarter. They're slightly slower in the short term because they're paying for code quality as they go. But velocity doesn't measure that benefit. It just measures throughput.
Velocity Isn't Comparable
Here's another problem: you can't compare velocity across teams.
Your iOS team says they have 60-point velocity. Your backend team says 40. Which team is more productive?
You don't know. A point is whatever the team thinks a point is. The iOS team might be estimating conservatively. The backend team might be estimating aggressively. The 20-point difference is meaningless.
You also can't compare velocity across time if the team composition changes. A new engineer joins and velocity drops, because the team is absorbing onboarding cost. Then velocity climbs back up. The dip reads as lost productivity, but it's just the team changing shape, and velocity can't tell the difference.
Worse, you can't compare velocity across sprints if the sprint scope changes. If you're in a sprint where you're doing a migration, you might commit to less feature work but the velocity might actually represent more total work. Velocity doesn't capture that.
The only honest thing you can say about velocity is: "This team completed this many points this sprint." Anything beyond that is extrapolation.
What to Track Instead
Here are metrics that actually tell you something:
First: deployment frequency. How many times a week do you ship to production? This is hard to game. Either you ship or you don't. A team shipping five times a week is more productive than a team shipping once a week, all else equal. This metric is honest. And it correlates with product outcomes: shipping faster means getting feedback faster.
Second: change lead time. How long does it take from "code committed" to "running in production"? This tells you how fast your deployment pipeline is and how much ceremony exists around shipping. A team with a one-hour change lead time is different from a team with a one-week change lead time. The difference is real.
Third: feature cycle time. How long from "spec approved" to "shipped to users"? This is the full cycle time that actually matters. Not how fast engineering works, but how fast you can go from "here's the requirement" to "users have it." This is what stakeholders actually care about.
Fourth: mean time to recovery for incidents. When something breaks, how long to fix it? This tells you about code clarity, testing quality, and operational maturity. A team that recovers from incidents in 30 minutes is healthier than a team that takes three hours.
Fifth: change success rate. What percentage of deployments cause incidents or issues? A team with a 95% success rate is different from a team with a 70% success rate. This tells you about quality and testing.
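All five signals can be computed from a plain deployment log. A minimal sketch with hypothetical records (the field names and values are illustrative, not any specific tool's schema):

```python
from datetime import datetime, timedelta

# Hypothetical one-week deployment log: commit time, deploy time,
# whether the deploy caused an incident, and when it was resolved.
deploys = [
    {"committed": datetime(2024, 5, 6, 9, 0),
     "deployed": datetime(2024, 5, 6, 11, 0),
     "incident": False, "resolved": None},
    {"committed": datetime(2024, 5, 7, 14, 0),
     "deployed": datetime(2024, 5, 8, 10, 0),
     "incident": True, "resolved": datetime(2024, 5, 8, 10, 30)},
    {"committed": datetime(2024, 5, 9, 16, 0),
     "deployed": datetime(2024, 5, 10, 9, 0),
     "incident": False, "resolved": None},
]

weeks = 1  # observation window

# Deployment frequency: deploys per week.
deployment_frequency = len(deploys) / weeks

# Change lead time: commit -> production, averaged.
lead_times = [d["deployed"] - d["committed"] for d in deploys]
avg_lead_time = sum(lead_times, timedelta()) / len(lead_times)

# Change failure rate: fraction of deploys causing incidents.
failures = [d for d in deploys if d["incident"]]
change_failure_rate = len(failures) / len(deploys)

# Mean time to recovery: deploy -> incident resolved, for failed deploys.
recovery_times = [d["resolved"] - d["deployed"] for d in failures]
mttr = sum(recovery_times, timedelta()) / len(recovery_times)

print(f"deploys/week:        {deployment_frequency:.1f}")
print(f"avg lead time:       {avg_lead_time}")
print(f"change failure rate: {change_failure_rate:.0%}")
print(f"MTTR:                {mttr}")
```

Note that none of these numbers depend on anyone's estimate: they fall out of timestamps your CI/CD and incident tooling already record, which is what makes them hard to game.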
The Relationship to Codebase Health
Here's what's interesting: all of these metrics correlate with codebase health better than velocity does.
A team with a healthy codebase (clear ownership, visible dependencies, well-tested code) will have high deployment frequency, low change lead time, and high change success rate. You can measure codebase health by measuring these outcomes.
A team with a messy codebase will have low deployment frequency, high change lead time, and low change success rate. The codebase is slowing them down in ways velocity never captures.
The reason to care about this is that it inverts the conversation. Instead of "how do we increase velocity," the question becomes "how do we make the codebase healthier." And codebase health improvements have real ROI in outcomes you can measure.
How to Transition
If your team is currently velocity-focused, you don't need to abandon it overnight. But start measuring and tracking the other metrics. Make them visible in your planning meetings. Talk about them.
Within two to three quarters, you'll start noticing that the metrics tell a different story than velocity. Velocity says everything is fine. But deployment frequency is dropping, change lead time is increasing, and change success rate is down. That's the truth.
Once you see that, you can start making decisions based on reality instead of a fiction you've unconsciously created.
The move is not to attack velocity directly. It's to say "here are metrics that matter more" and let velocity recede into the background as the less relevant metric it actually is.
Frequently Asked Questions
Q: Does this mean we should abandon agile and two-week sprints?
No. Sprints are fine. Planning in iterations is fine. The problem isn't sprints. The problem is using story points as a productivity metric. You can keep sprints and kill velocity as a target.
Q: What if my leadership requires velocity tracking?
You can track velocity as a historical record without making it a target. "Here's how many points we completed" is different from "here's the velocity we committed to and we hit it." One is measurement. The other is goal-setting. Only one of them creates perverse incentives.
Q: How do we convince engineers to care about metrics other than velocity?
Show them that the new metrics actually measure things they care about. Deployment frequency measures "how often do I get to ship?" Change lead time measures "how long before my work goes to users?" Change success rate measures "how many times did my code break things?" These are things engineers care about more than hitting a story point target.
Related Reading
- Sprint Velocity: The Misunderstood Metric
- Cycle Time: Definition, Formula, and Why It Matters
- DORA Metrics: The Complete Guide for Engineering Leaders
- Programmer Productivity: Why Measuring Output Is the Wrong Question
- Software Productivity: What It Really Means and How to Measure It
- Automated Sprint Planning: How AI Agents Build Better Sprints
- Velocity Doesn't Tell You How Far You Need to Go
- Why Software Estimation Is Structurally Hard
- What Is Sprint Estimation?
- What Is Velocity Estimation?
- Sprint Intelligence Loop