
Metrics for Software Development: What Your Team Should Track and Why

The definitive guide to software development metrics. Organized by stakeholder—metrics for developers, managers, and executives—with real-world examples and anti-patterns.

Glue Team (Editorial Team)

March 5, 2026 · 17 min read
Tags: Agile metrics, DORA metrics, developer productivity metrics, development team metrics, engineering KPIs, metrics for software development, software development metrics

The first engineering metrics I ever tracked were lines of code and bug count — at Shiksha Infotech, where I was a solution architect building Java monitoring tools. Those metrics told me almost nothing useful. By the time I was CTO at Salesken, I'd learned that the right metrics depend on what question you're actually trying to answer. "Are we shipping?" is a different question from "Are we shipping well?" which is a different question from "Are we getting better?"

Software development metrics are everywhere. Pick any engineering conference and you'll hear about velocity, cycle time, test coverage, deployment frequency, and a dozen others. Yet most engineering teams are still flying blind—tracking the wrong metrics or none at all.

The problem isn't that metrics don't matter. It's that not all metrics matter equally, and different stakeholders care about different things. A developer cares about build time and review feedback. A manager cares about velocity and sprint predictability. An executive cares about engineering ROI and time to market.

This guide separates signal from noise. We'll walk through the metrics that actually matter, organized by who cares about them. We'll also cover the metrics that sound useful but will mislead you if you optimize for them.

Metrics for Developers: Making Your Own Job Better

Developers care about metrics that directly impact their daily experience. These metrics reveal friction in the development process and opportunities to reduce wasted time.

Build Time

What it is: Time from initiating a build until it completes and is ready for testing or deployment.

Why developers care: Slow builds waste time and interrupt flow state. A 10-minute build means developers context-switch, check Slack, or work on something else. By the time the build finishes, they've lost focus.

Industry benchmarks: <2 minutes is excellent | 2-5 minutes is good | 5-15 minutes is acceptable | >15 minutes is a pain point

How to measure: Extract build time from your CI/CD logs (GitHub Actions, CircleCI, GitLab CI). Calculate median build time weekly.
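As a sketch of that calculation (the (date, seconds) record shape is an assumption; pull real durations from your CI provider's logs or API):

```python
from collections import defaultdict
from datetime import date
from statistics import median

def weekly_median_build_time(builds):
    """builds: (date, duration_seconds) pairs extracted from CI logs.
    Returns {(iso_year, iso_week): median_seconds}."""
    by_week = defaultdict(list)
    for day, seconds in builds:
        # Group by ISO year + week so year boundaries don't merge weeks
        iso = day.isocalendar()
        by_week[(iso[0], iso[1])].append(seconds)
    return {week: median(times) for week, times in by_week.items()}

builds = [(date(2026, 3, 2), 140), (date(2026, 3, 3), 160), (date(2026, 3, 4), 900)]
print(weekly_median_build_time(builds))  # one entry for ISO week (2026, 10)
```

The median (not the mean) keeps one pathological 15-minute build from hiding that most builds are fast.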

How to improve: Profile your build to find the slow part. Often it's:

  • Unnecessary test runs (run only affected tests, not full suite)
  • Slow dependency downloads (cache dependencies)
  • Inefficient Docker builds (layer caching, multi-stage builds)
  • Unneeded linting on CI (lint locally, fail fast)

Real example: A React frontend team had 18-minute builds. They discovered:

  • Full dependency install on every build (no caching)
  • Running entire test suite even for single-file changes
  • Docker rebuilds from scratch (no layer caching)

Changes: dependency caching + selective test running + Docker layer caching = 2.5-minute build. That's 15.5 minutes saved per build. At one build per developer per day, 15.5 minutes × 6 developers × 250 work days ≈ 23,250 developer-minutes (roughly 388 hours) recovered per year.

Code Review Time (Feedback Loop)

What it is: Time from opening a pull request until receiving the first review feedback.

Why developers care: Slow code reviews block progress and kill momentum. A developer waits for review, and if no feedback comes for 24 hours, they work on something else. When they get back to the PR, they've lost context.

Industry benchmarks: <2 hours is excellent | 2-8 hours is good | 8-24 hours is acceptable | >24 hours is a blocker

How to measure: Extract PR creation time and first review comment time from GitHub/GitLab. Calculate median time from PR open to first feedback.
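A minimal version of that computation, assuming you've already fetched (opened_at, first_review_at) timestamp pairs from the GitHub or GitLab API:

```python
from datetime import datetime
from statistics import median

def median_first_feedback_hours(prs):
    """prs: (opened_at, first_review_at) datetime pairs.
    None in the second slot means no review yet; those are skipped."""
    waits = [(review - opened).total_seconds() / 3600
             for opened, review in prs if review is not None]
    return median(waits)

prs = [
    (datetime(2026, 3, 2, 9, 0), datetime(2026, 3, 2, 11, 0)),   # 2 h wait
    (datetime(2026, 3, 2, 14, 0), datetime(2026, 3, 3, 14, 0)),  # 24 h wait
    (datetime(2026, 3, 3, 10, 0), datetime(2026, 3, 3, 14, 0)),  # 4 h wait
]
print(median_first_feedback_hours(prs))  # 4.0
```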

How to improve:

  • Make reviews a daily ritual (30 minutes each morning, not "whenever you find time")
  • Enforce SLAs (you review today, even if busy)
  • Pull senior devs into reviews (they unblock junior devs)
  • Limit concurrent PRs (so you review before taking new work)

Real example: A backend team had a 40-hour average review time (reviews happened 2-3 times per week). They made one simple change: a 30-minute code review block every morning from 10:00 to 10:30. Average review time dropped to 4 hours. Same amount of review work, just concentrated.

Test Coverage and Speed

What it is: Percentage of code covered by automated tests and time tests take to run.

Why developers care: Tests give confidence to ship, but slow tests kill productivity. Developers want:

  • Confidence that their changes don't break things (so coverage matters)
  • Fast feedback (slow tests defeat the purpose)

Industry benchmarks: Coverage: 60-80% | Test speed: full suite <5 minutes

How to measure: Coverage tools (Jest, pytest, Istanbul). Test speed from CI logs.

How to improve (coverage):

  • Identify gaps with coverage reports
  • Test critical paths (business logic, integrations)
  • Don't obsess over 100% (diminishing returns after 75-80%)

How to improve (speed):

  • Run unit tests locally and in parallel
  • Run integration tests in separate job
  • Run end-to-end tests only on release branches
  • Cache test results (don't rerun unchanged tests)

Real example: A Python backend had 85% coverage, but the full suite took 12 minutes to run. Developers only ran tests locally for their own code, so issues snuck into CI. They:

  • Split tests into unit (2 min), integration (5 min), and e2e (15 min, release only)
  • Ran unit tests locally before pushing
  • Ran unit + integration on PR CI
  • Ran everything on release CI

False positives dropped 70%, and developers gained confidence.

Debugging and Incident Response

What it is: Time to diagnose and fix production issues.

Why developers care: Unplanned incident work is the worst kind of interrupt. A production bug at 4 PM that takes 2 hours to debug kills your next day of productivity.

Industry benchmarks: Mean Time to Recovery (MTTR): <15 minutes is excellent | 15-60 minutes is good | 1-4 hours is acceptable | >4 hours is bad

How to measure: Time from incident alert to system stable (usually in PagerDuty, Opsgenie, etc.).
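The arithmetic is simple once you have alert and recovery times. A sketch, using bare minute offsets as a simplification (real PagerDuty/Opsgenie exports give full timestamps you would subtract instead):

```python
from statistics import mean

def mttr_minutes(incidents):
    """incidents: (alert_minute, stable_minute) pairs for one period.
    Mean Time to Recovery = average of the per-incident durations."""
    return mean(end - start for start, end in incidents)

incidents = [(0, 12), (100, 190), (500, 530)]  # durations: 12, 90, 30 minutes
print(mttr_minutes(incidents))  # 44
```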

How to improve:

  • Better observability (structured logging, distributed tracing, metrics)
  • Runbooks for common incidents
  • Blameless post-mortems (what failed in the system, not who messed up)
  • Automation for repetitive fixes (auto-rollback, auto-scaling, etc.)

Real example: A payments team had 90-minute MTTR. They added:

  • Distributed tracing (could see exactly which service failed)
  • Runbook for top 3 incidents (clear steps to resolve)
  • Automated rollback (if error rate spikes, revert last deploy)

New MTTR: 12 minutes.

Onboarding Time

What it is: Time for a new developer to be productive (usually measured as "when did they submit their first PR").

Why developers care: Painful onboarding affects team morale and job satisfaction. A developer should feel productive in their first week, not lost for a month.

Industry benchmarks: <2 weeks is excellent | 2-4 weeks is good | 4-8 weeks is typical | >8 weeks is too long

How to measure: Track from start date to first PR merged. Ask departing developers about their onboarding experience (exit interviews).

How to improve:

  • Clear onboarding guide (written, not tribal knowledge)
  • Pre-configured dev environments (Docker, Docker Compose)
  • Assigned onboarding buddy (senior dev spends 2 hours/day first week)
  • Small, well-scoped first task (not "understand the entire codebase")
  • Automated setup (scripts that set up DB, run migrations, etc.)

Real example: A startup had 6-week onboarding (nobody documented anything). New dev's first week was spent asking "where's X?" and "how do I Y?". They created:

  • 30-minute onboarding video (show where things live, how to run tests)
  • make setup command that initialized dev environment
  • "First Issues" label (curated easy tasks for new devs)

Onboarding dropped to 2 weeks. First PR submitted by day 3.

Metrics for Managers: Planning and Team Health

Engineering managers care about different metrics. These reveal team capacity, predictability, and overall health.

Velocity

What it is: Amount of work a team completes per sprint, measured in story points.

Why managers care: Velocity is the foundation of planning. If you know your team completes 50 story points per sprint, you can predictably plan: 2-quarter roadmap = 8 sprints × 50 points = 400 points of capacity.

Industry benchmarks: Consistency matters more than absolute numbers. A team doing 45-50 points every sprint is more valuable than a team doing 30-70 points unpredictably.

How to measure: Sum story points of completed work per sprint. Track trend over 6-8 sprints.
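Since consistency matters more than the raw number, it helps to track spread alongside the average. A sketch using coefficient of variation (the two sample teams are invented):

```python
from statistics import mean, pstdev

def velocity_consistency(points_per_sprint):
    """Returns (average velocity, coefficient of variation).
    A low CV means delivery you can plan around."""
    avg = mean(points_per_sprint)
    return avg, pstdev(points_per_sprint) / avg

steady = [45, 50, 48, 47, 50, 46]   # CV ≈ 0.04: easy to plan around
erratic = [30, 70, 40, 65, 35, 60]  # CV ≈ 0.31: similar average, unpredictable
print(velocity_consistency(steady))
print(velocity_consistency(erratic))
```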

How to improve:

  • Be consistent with estimation (use reference stories)
  • Account for non-sprint work (on-call, customer issues, technical debt)
  • Don't squeeze harder (pushing for more velocity is a fast road to burnout)
  • Reduce blockers (unblock people waiting on others)

Real example: A team estimated velocity at 60 points but consistently delivered 40. Investigation showed:

  • 20 points per sprint went to "unexpected production issues"
  • 10-20 points lost to waiting on another team
  • Rest was optimistic estimation

They created an "incident tax" bucket (20 points reserved for surprises) and worked with the other team to reduce dependencies. Predictable 40-point velocity was better than chaotic 60.

Sprint Predictability

What it is: Ability to forecast what will be done by end of sprint.

Formula: (Points actually completed / Points committed at start) × 100

Industry benchmarks: >90% is excellent | 80-90% is good | 70-80% is acceptable | <70% is unpredictable

Why managers care: Unpredictable sprints make roadmap planning impossible. Customers can't trust your dates. Team morale suffers when you keep missing goals.

How to measure: Compare committed points (at sprint planning) to completed points (at sprint end).
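A one-line sanity check, computing predictability as completed over committed points (the 60 and 35 come from the mobile-team example below):

```python
def sprint_predictability(committed, completed):
    # (points completed / points committed) x 100, rounded to a whole percent
    return round(completed / committed * 100)

print(sprint_predictability(60, 35))  # 58
```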

How to improve:

  • Better estimation (involve team in pointing, use historical velocity)
  • Protect from interrupts (designate someone for urgent issues, shield rest of team)
  • Slice work smaller (smaller stories have more predictable estimates)
  • Don't pack sprints (leave 20% buffer for unknowns)

Real example: A mobile team committed 60 points, delivered 35 (58% predictability). They were getting 20-30 points of "urgent" work mid-sprint:

  • Customer escalations
  • "Quick fixes" from product
  • Production bugs

They created a "2-person interrupt buffer" (2 devs always available for urgent work, not sprint-assigned). Rest of team committed 40 points knowing 20 were reserved. New predictability: 95%.

Technical Debt Ratio

What it is: Percentage of sprint capacity spent on technical debt vs. new features.

Formula: (Story points of tech debt / Total story points) × 100

Industry benchmarks: 10-30% is healthy | <10% means you're skipping important maintenance | >30% means you're drowning in debt

Why managers care: Too little tech debt work and you slow down (compounding complexity). Too much and you're not delivering customer features. The balance matters.

How to measure: Tag issues as "Technical Debt" in your backlog. Sum their points. Divide by total sprint points.
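The same steps in code (the 12-of-60 sprint is invented to land in the healthy band):

```python
def tech_debt_ratio(debt_points, total_points):
    # (story points tagged "Technical Debt" / total sprint points) x 100
    return debt_points / total_points * 100

print(tech_debt_ratio(12, 60))  # 20.0 (inside the healthy 10-30% band)
```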

How to improve:

  • Create a "tech debt budget" (commit 20% of capacity)
  • Make tech debt visible (dashboard showing debt ratio)
  • Tie tech debt to business impact ("refactoring X reduces defect rate by Y%")
  • Celebrate tech debt wins (public recognition when you pay down a major debt)

Real example: A company with zero tech debt allocation had defect rate increasing 2% per quarter (as codebase got messier). They reserved 20% of capacity for tech debt (refactoring, test coverage, documentation). Defect rate stabilized, and velocity actually increased (less time lost to debugging and review churn).

Team Morale and Burnout

What it is: Subjective health metric revealing if the team is happy and sustainable.

How to measure: Monthly 1-on-1 conversations. Simple question: "On a scale of 1-10, how much do you enjoy your job?" Trend the average. Follow up on anyone at <7.
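A trivial helper for trending those answers, assuming a {name: score} record per month (names invented):

```python
from statistics import mean

def morale_check(scores, floor=7):
    """scores: {name: 1-10 answer} from monthly 1-on-1s.
    Returns the team average and who to follow up with (anyone under the floor)."""
    return mean(scores.values()), [name for name, s in scores.items() if s < floor]

scores = {"ana": 8, "ben": 6, "chen": 9, "dee": 5}
print(morale_check(scores))  # average 7, follow up with ben and dee
```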

Why managers care: Burnout kills retention, productivity, and quality. Burned-out teams make mistakes and ship worse products. Prevention is cheaper than hiring replacements.

How to improve:

  • Cap crunch periods (no more than 2 weeks of 50+ hour weeks)
  • Protect time off (no Slack during vacation)
  • Reduce on-call burden for overloaded individuals
  • Pay down technical debt (reduces frustration and context-switching)
  • Celebrate wins (recognize good work publicly)

Real example: A team's average morale dropped from 7.2/10 to 5.8/10 over 3 months. The manager's investigation revealed:

  • Production was brittle (frequent incidents and pages during off-hours)
  • Lots of context-switching (unclear priorities, many half-finished projects)
  • No recent wins (shipping felt like whack-a-mole, not progress)

Fixes: automated testing (fewer incidents), WIP limit (fewer interrupts), public sprint retrospectives (visibility of progress). Morale recovered to 7.8/10.

Metrics for Executives: Business Alignment

Executives care about metrics that connect engineering to business outcomes. These are the metrics that justify engineering budgets.

Engineering ROI

What it is: Business value generated per dollar spent on engineering.

Formula: Annual revenue attributed to engineering / Total annual engineering spend

Industry benchmarks: Varies wildly, but 10:1 (every dollar of engineering generates $10 of revenue) is typical for healthy SaaS.

Why executives care: Engineering is a cost center in the P&L, but it generates revenue. Understanding ROI helps justify headcount, tools, and training investments.

How to measure:

  • For product features: track revenue from customers using the feature, net out acquisition costs, and compare to the feature's development cost
  • For platform/infrastructure: measure cost avoidance (how much would we spend on third-party services instead) or time saved (engineering time freed up for higher-value work)
  • For tools/automation: measure time saved × loaded labor cost
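The top-level ratio itself is one division; the core-product figures below reproduce the SaaS example in this section:

```python
def engineering_roi(attributed_revenue, engineering_spend):
    # Annual revenue attributed to engineering / total annual engineering spend
    return attributed_revenue / engineering_spend

# $20M revenue on $1.5M of feature cost, as in the example below
print(engineering_roi(20_000_000, 1_500_000))  # ~13.3, i.e. "13:1"
```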

How to improve:

  • Focus on high-ROI features (customer-requested, well-validated)
  • Kill low-ROI features (beautiful code nobody uses = zero ROI)
  • Automate low-value tasks (manually generating reports = negative ROI)
  • Invest in platform/infrastructure (higher ROI than point solutions)

Real example: A SaaS company spent $2M annually on engineering. They calculated:

  • Core product features: $20M in revenue, $1.5M in cost = 13:1 ROI
  • Customer success integrations: $2M revenue, $300K cost = 6.7:1 ROI
  • Infrastructure/platform: Enables above features, $200K cost
  • Admin/process work: $0 revenue, $200K cost (candidates for automation)

They shifted 50% of the admin work to automation (now $100K) and reallocated engineers to high-ROI customer features. ROI improved from 10:1 to 12:1.

Time to Market

What it is: Time from feature request to production deployment.

Formula: Average days from green-light to customer-facing availability

Industry benchmarks: <2 weeks is excellent | 2-4 weeks is competitive | 1-2 months is typical | >3 months is slow

Why executives care: Competitors move fast. The team that ships features first wins customers. Time to market is a competitive advantage.

How to measure: Pick representative features. Track from approval to deployment. Calculate average.
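A sketch of that average, assuming you record an approval date and a deployment date per sampled feature (the dates here are invented):

```python
from datetime import date
from statistics import mean

def avg_time_to_market(features):
    """features: (approved_on, deployed_on) date pairs for a
    representative sample of shipped features."""
    return mean((shipped - approved).days for approved, shipped in features)

features = [
    (date(2026, 1, 5), date(2026, 1, 19)),  # 14 days
    (date(2026, 1, 12), date(2026, 2, 9)),  # 28 days
]
print(avg_time_to_market(features))  # 21
```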

How to improve:

  • Shorter feedback loops (less back-and-forth with stakeholders)
  • Parallel work (design and engineering concurrent, not sequential)
  • Feature flags (ship behind flags, light it up in prod without full release)
  • Reduced approval gates (do you really need 3 sign-offs?)
  • Smaller batches (ship one thing well vs. everything half-baked)

Real example: A fintech company had 12-week feature time-to-market (compliance reviews, security reviews, stakeholder reviews took months). They analyzed fast features (deployed <4 weeks):

  • APIs for partners (high value)
  • Dashboard improvements (tools teams use daily)
  • Integration fixes (customer requests)

They created a "fast track" for these high-impact types (4-week cycle, streamlined approvals). Non-critical features used the standard 12-week process. Overall average dropped to 6 weeks. Time-to-market for critical features improved 3x.

Customer Acquisition Cost vs. Engineering Cost

What it is: Revenue per engineering headcount vs. cost of acquiring customers.

Why executives care: Some engineering investments make sense from a customer acquisition perspective. Paying one engineer to build a self-serve onboarding might reduce CAC by 20%, providing huge ROI.

How to measure:

  • CAC: Total sales + marketing spend / New customers
  • Revenue per engineer: Annual revenue / Engineering headcount
  • Compare and understand the relationship
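Both ratios are simple divisions; the invented spend and headcount below reproduce the $150K CAC and $500K revenue-per-engineer figures in the example that follows:

```python
def cac(sales_marketing_spend, new_customers):
    # Customer acquisition cost: total sales + marketing spend / new customers
    return sales_marketing_spend / new_customers

def revenue_per_engineer(annual_revenue, engineers):
    return annual_revenue / engineers

print(cac(3_000_000, 20))                    # 150000.0
print(revenue_per_engineer(10_000_000, 20))  # 500000.0
```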

How to improve:

  • Focus engineering on self-serve features (reduces CAC)
  • Build integration/partnership features (unlocks new customer segments, reduces CAC)
  • Invest in retention (long-term revenue is worth more than acquisition)

Real example: A B2B software company had $150K CAC and $500K revenue per engineer. Executive question: "Should we invest in sales team or engineering team?" They calculated:

  • Hiring 5 sales people ($500K annual cost) would generate $3M revenue (assuming typical close rate and deal size)
  • Hiring 5 engineers ($1M cost) could build self-serve onboarding, reducing CAC to $100K. This unlocks 30% more revenue in existing markets = $2M annual incremental revenue

They chose engineering because the payoff (per-engineer ROI) was better, and the benefit compounds (saved customers buy more, refer others).

Metrics That Sound Useful But Aren't

Some metrics are red herrings: they feel good to report but mislead decision-making.

Lines of Code

Why it seems useful: More code = more work, right?

Why it's wrong: More code usually means worse engineering. Line count tends to correlate negatively with quality: experienced engineers write less code to solve the same problems, while juniors write verbose code.

What to track instead: Business outcomes (features shipped, bugs fixed, ROI).

Individual Contributor Metrics

Why it seems useful: Ranking developers by velocity, commits, or code review speed identifies top performers.

Why it's wrong: This incentivizes the wrong behavior (commit lots of small changes to look busy, approve PRs quickly without actually reviewing, etc.). It also kills team collaboration (why help junior developers if it reduces your individual metrics?).

What to track instead: Team outcomes. If your team velocity is high and quality is good, individuals are performing well.

Code Complexity Metrics

Why it seems useful: Complex code = future bugs, right?

Why it's wrong: Some problems are genuinely complex. Forcing simplistic solutions to reduce cyclomatic complexity creates worse code. Also, complexity metrics are easy to game (extract functions to reduce perceived complexity without improving maintainability).

What to track instead: Actual defects found, not predicted defects. If complex code has low bug escape rates, the developers understand it well.

100% Test Coverage

Why it seems useful: More tests = fewer bugs.

Why it's wrong: The last 20% of coverage usually has diminishing returns. It's also easy to write tests that pass but don't actually test anything meaningful. Developers gaming coverage percentages often write useless tests.

What to track instead: Bug escape rate. If your defect rate is low with 70% coverage, higher coverage probably won't help.

Utilization (% of time on billable work)

Why it seems useful: High utilization = high productivity.

Why it's wrong: 100% utilization is actually a sign of poor process. No time for:

  • Learning (reading docs, taking courses)
  • Code review (helping teammates)
  • Planning (thinking about what to build)
  • Retrospectives (how to improve)

Healthy teams run at 70-80% utilization.

What to track instead: Outcomes (features shipped, bugs fixed, team morale).

The Right Metrics Framework

Effective metric systems follow this pattern:

  1. Start with business outcomes (revenue, customers, retention)
  2. Work backwards to engineering outcomes (feature delivery, quality, speed)
  3. Drill down to team-level metrics (velocity, cycle time, technical debt)
  4. Measure individual experience (build time, review feedback, debugging ease)

Choose 3-5 metrics that align with your current biggest challenge. Optimize those. When you've improved, shift focus to the next constraint. Metrics are meant to guide strategy, not become your strategy.

The best engineering teams aren't obsessed with metrics. They're obsessed with shipping value, maintaining quality, and keeping their team happy. Metrics help ensure those three things are actually happening.


Ready to get better visibility? Glue's AI agents continuously monitor these metrics across your codebase, deployment systems, and incident trackers. Rather than manual dashboards, engineering leaders get autonomous insights: anomalies surfaced early, questions answered in seconds, and trends analyzed without meetings. See how other engineering teams use AI agents to turn metrics into action.


Related Reading

  • Coding Metrics That Actually Matter
  • Engineering Metrics Examples: 20+ Key Metrics Your Team Should Track
  • Metrics for Software: Choosing the Right KPIs for Your Stage
  • DORA Metrics: The Complete Guide for Engineering Leaders
  • Engineering Team Metrics: The Complete Framework
  • Software Metrics in Software Engineering: From Code Analysis to Business Outcomes

