By Arjun Mehta, Principal Engineer at Glue
I have been the engineer who inherited a codebase with no health metrics. No complexity scores, no churn analysis, no coverage tracking. Just 200,000 lines of code, a README that said "good luck," and a product team wondering why every feature estimate was wrong. If you are an engineering manager trying to understand code health metrics, the good news is that the measurement tools exist. The bad news is that most teams track the wrong things or track the right things in the wrong context.
This guide covers the metrics that actually matter, how they connect to business outcomes, and how to build a dashboard that gives you and your leadership team a shared understanding of codebase health.
Why Code Health Matters
Code health is not an abstract engineering concern. It has direct, measurable impact on your team's ability to deliver features, your organization's spending efficiency, and your retention rates.
According to Jellyfish's 2025 engineering benchmarks, developers spend 23-42% of their time managing technical debt. That is not a rounding error. For a team of 20 engineers at an average fully loaded cost of $200K per year, that translates to $920K to $1.68M annually spent on maintenance rather than new capabilities.
Stripe's developer research puts a finer point on it: engineers spend an average of 17 hours per week on maintenance tasks, including debugging, refactoring, and working around existing code problems. That is more than two full days of every week consumed by code health issues.
The Visibility Problem
The challenge is not that code health is hard to measure. It is that the measurements rarely reach the people who need them. Engineering managers, VPs, and CTOs make resourcing decisions based on project timelines and feature requests. If code health data does not surface in a format they can act on, it stays invisible until something breaks.
I have been in planning meetings where a PM asks for a "simple" feature and the room goes quiet because every engineer knows the module it touches is held together with duct tape. But nobody has the data to explain why that "simple" feature will actually take six weeks. The conversation becomes engineering saying "trust us, it is complicated" versus product saying "it looks straightforward." Neither side has evidence.
That is the gap code health metrics fill. They turn gut feelings into data that both technical and non-technical stakeholders can read. For a deeper look at how this connects to the broader conversation around technical debt visibility, our post on that topic covers the strategic framing.
The Essential Metrics
Not all code health metrics are equally useful. Some are easy to measure but tell you nothing actionable. Others are harder to collect but directly predict team velocity and risk. Here are the ones that matter.
Cyclomatic Complexity
Cyclomatic complexity measures the number of independent paths through a unit of code. A function with no branches has a complexity of 1. Every if, switch, loop, or exception handler adds a path.
Why it matters: high-complexity functions are harder to test, harder to modify, and more likely to contain bugs. A function with a cyclomatic complexity of 30 has 30 linearly independent execution paths. Testing every path thoroughly is impractical, which means changes to that function carry high regression risk.
Track the distribution across your codebase, not just the average. A codebase with an average complexity of 5 might still have a handful of functions at 40+ that are ticking time bombs.
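Most static analysis tools report this out of the box. As a minimal sketch of what the raw data looks like, here is how you might scan a Python codebase with the open-source radon library. The src directory is an assumption about your repository layout, and the cutoffs are the starting-point thresholds suggested later in this guide.

```python
# Minimal sketch: flag high-complexity functions with radon (pip install radon).
# The "src" path is an assumption; adjust for your repository layout.
from pathlib import Path
from radon.complexity import cc_visit

FLAG, INVESTIGATE = 15, 25  # starting-point thresholds, tune per team

for path in Path("src").rglob("*.py"):
    for block in cc_visit(path.read_text(encoding="utf-8")):
        if block.complexity >= FLAG:
            level = "INVESTIGATE" if block.complexity >= INVESTIGATE else "FLAG"
            print(f"{level:11} {path}:{block.lineno} {block.name} "
                  f"(complexity {block.complexity})")
```

The output is exactly the distribution view you want: a ranked list of outliers rather than a single average that hides them.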
Code Churn
Code churn measures how frequently files are modified. High-churn files are areas of the codebase that are constantly being changed, often a signal that the code is poorly structured, poorly understood, or both.
The metric becomes powerful when you overlay it with complexity. A high-complexity, high-churn file is your highest risk. It is hard to change correctly and it changes all the time. If that file is also owned by a single engineer, you have a triple threat: complexity, volatility, and knowledge concentration.
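You do not need special tooling to get a first approximation; git history already has the data. A minimal sketch, assuming it runs inside the repository and that a 90-day window is a reasonable horizon for your team:

```python
# Minimal sketch: rank files by churn (number of commits touching them).
# The 90-day window is an assumption; widen or narrow it to taste.
import subprocess
from collections import Counter

log = subprocess.run(
    ["git", "log", "--since=90 days ago", "--name-only", "--pretty=format:"],
    capture_output=True, text=True, check=True,
).stdout

churn = Counter(line for line in log.splitlines() if line.strip())
for path, commits in churn.most_common(20):
    print(f"{commits:4d}  {path}")
```

Join this list with the complexity scan above and you already have the raw material for the quadrant map described later in this guide.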
Test Coverage
Test coverage is the most commonly tracked code health metric and also the most commonly misunderstood. Coverage tells you what percentage of your code is exercised by automated tests. It does not tell you whether those tests are good.
80% coverage with meaningful assertions is far more valuable than 95% coverage where half the tests just call functions without checking the output. Track coverage as a baseline indicator, but do not use it as a standalone quality signal. Combine it with mutation testing or test failure rates for a more accurate picture.
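To make the distinction concrete, here is an illustration built around a hypothetical apply_discount function. Both tests below produce identical line coverage, but only one of them would ever fail.

```python
# apply_discount is a hypothetical function used purely for illustration.
def apply_discount(price: float, percent: float) -> float:
    return round(price * (1 - percent / 100), 2)

def test_coverage_only():
    apply_discount(100.0, 20.0)  # executes every line, asserts nothing

def test_meaningful():
    assert apply_discount(100.0, 20.0) == 80.0   # verifies the actual output
    assert apply_discount(100.0, 0.0) == 100.0   # and a boundary case
```

A coverage report counts both tests the same way. Mutation testing would immediately expose the first one as worthless.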
Dependency Freshness
Outdated dependencies are a silent health risk. They accumulate security vulnerabilities, compatibility issues, and maintenance burden. Track the age of your dependencies, the number that are behind the latest version, and specifically the number with known CVEs.
Kong's 2024 developer survey found that dependency issues contribute to a 30% reduction in development speed for affected teams. That is a significant drag on velocity that often goes unmeasured.
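For a Python project, a first pass can be as simple as parsing pip's own outdated report; npm, cargo, and other ecosystems have equivalents. A minimal sketch applying the two-major-versions rule from the thresholds section below; for CVEs specifically, a dedicated scanner such as pip-audit is the better tool.

```python
# Minimal sketch: list outdated dependencies and flag those two or more
# major versions behind. Assumes pip is on PATH; use pip-audit for CVEs.
import json
import subprocess

out = subprocess.run(
    ["pip", "list", "--outdated", "--format=json"],
    capture_output=True, text=True, check=True,
).stdout

for pkg in json.loads(out):
    cur = pkg["version"].split(".")[0]
    new = pkg["latest_version"].split(".")[0]
    behind = int(new) - int(cur) if cur.isdigit() and new.isdigit() else 0
    flag = "  <-- schedule an upgrade" if behind >= 2 else ""
    print(f"{pkg['name']}: {pkg['version']} -> {pkg['latest_version']}{flag}")
```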
Complexity, Churn, and Coverage
These three metrics form the core triangle of code health. Each is useful alone, but their intersections are where the real insights live.
The Danger Zones
Map your codebase on two axes: complexity (vertical) and churn (horizontal). You get four quadrants:
- Low complexity, low churn: Stable, well-structured code. Leave it alone.
- Low complexity, high churn: Actively developed, healthy areas. Normal.
- High complexity, low churn: Legacy code that works but is fragile. Monitor it.
- High complexity, high churn: This is your danger zone. Code that is both hard to change and constantly being changed. Prioritize refactoring.
Now overlay coverage on the danger zone quadrant. High complexity, high churn, low coverage? That is where your next production incident is brewing.
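If you have collected per-file complexity, churn, and coverage into dictionaries keyed by file path (the earlier sketches can be adapted to return these), the quadrant logic itself is only a few lines. A minimal sketch; the cutoff values are illustrative assumptions, not fixed rules:

```python
# Minimal sketch: bucket files into the four quadrants, then isolate the
# danger zone. complexity/churn/coverage are dicts keyed by file path;
# the cutoff values are illustrative assumptions.

def quadrant(cx: float, ch: int, cx_cut: float = 15, ch_cut: int = 10) -> str:
    if cx >= cx_cut and ch >= ch_cut:
        return "danger: high complexity, high churn"
    if cx >= cx_cut:
        return "legacy: high complexity, low churn"
    if ch >= ch_cut:
        return "active: low complexity, high churn"
    return "stable: low complexity, low churn"

def danger_zone(complexity, churn, coverage, min_cov=0.60):
    # Hard to change, changing constantly, and poorly tested.
    for path in complexity.keys() & churn.keys():
        if (quadrant(complexity[path], churn[path]).startswith("danger")
                and coverage.get(path, 0.0) < min_cov):
            yield path
```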
Knowledge Concentration Risk
Add one more dimension: who owns these files? If your danger zone code is maintained by a single engineer, you have what I call a "bus factor emergency." The system is fragile, changes frequently, is poorly tested, and only one person understands it.
This is where code health metrics connect directly to organizational risk. An engineering manager needs to know not just that a module is unhealthy, but who is carrying the risk and what happens when that person takes vacation or leaves.
Glue's knowledge risk mapping surfaces exactly this. It analyzes git history to identify which code modules are understood by only one or two people, and cross-references that with complexity and churn data. For engineering leaders who need to present code health to non-technical stakeholders, this is the bridge between technical metrics and business risk.
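If you want a rough first pass before adopting dedicated tooling, git history gets you most of the way. A minimal sketch that approximates a file's bus factor by counting distinct commit authors; the one-year window is an assumption:

```python
# Minimal sketch: count distinct commit authors per file over the past year.
import subprocess

def author_count(path: str, since: str = "1 year ago") -> int:
    out = subprocess.run(
        ["git", "log", f"--since={since}", "--format=%an", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return len({name for name in out.splitlines() if name.strip()})

# Cross-referencing with danger_zone from the previous sketch:
#   for path in danger_zone(complexity, churn, coverage):
#       if author_count(path) <= 1:
#           print(f"bus factor emergency: {path}")
```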
Practical Thresholds
Thresholds vary by language and team, but as starting points:
- Cyclomatic complexity: Flag anything above 15. Investigate anything above 25.
- Churn: Top 10% most-changed files deserve monthly review.
- Coverage: Below 60% on critical paths is a risk. Below 40% is urgent.
- Dependency age: More than 2 major versions behind on a core dependency is worth scheduling an upgrade.
These are not absolute rules. They are starting points to tune based on your team's context and tolerance.
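If you automate any of the checks above, it is worth keeping these numbers in one place so that tuning them is a one-line change. A minimal sketch of the same starting points expressed as a config:

```python
# Starting-point thresholds from this guide, collected as a single config.
# Tune per language and team; none of these are absolute rules.
THRESHOLDS = {
    "complexity_flag": 15,
    "complexity_investigate": 25,
    "churn_review_percentile": 0.90,  # top 10% most-changed files
    "coverage_risk": 0.60,            # below this on critical paths: risk
    "coverage_urgent": 0.40,          # below this: urgent
    "dependency_majors_behind": 2,    # schedule an upgrade past this
}
```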
Building a Health Dashboard
Metrics without visibility are just data. The goal is a dashboard that both engineering managers and non-technical leadership can use to make decisions.
What to Include
A useful code health dashboard has three layers:
Layer 1: Executive summary. A single health score or grade for the overall codebase, broken down by module or team. Use green/yellow/red or a simple numeric score. Leadership needs to see the headline without parsing individual metrics.
Layer 2: Metric detail. The actual metrics: complexity distribution, churn hotspots, coverage trends, and dependency status, broken down by module, team, or service. This is where engineering managers spend their time.
Layer 3: Risk mapping. Knowledge concentration overlaid with health metrics. Which unhealthy modules are single-person dependencies? This connects code health to people risk.
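There is no standard formula for the Layer 1 grade; the rollup below is purely an illustrative assumption, reusing the THRESHOLDS config from earlier. The point is that the rollup should be simple enough to explain in one sentence.

```python
# Illustrative sketch: grade a module by the share of its files that are
# both high-complexity and under-covered. The 5%/20% bands are assumptions.

def module_grade(files, complexity, coverage) -> str:
    risky = sum(
        1 for f in files
        if complexity.get(f, 0) >= THRESHOLDS["complexity_flag"]
        and coverage.get(f, 1.0) < THRESHOLDS["coverage_risk"]
    )
    share = risky / max(len(files), 1)
    if share > 0.20:
        return "red"
    if share > 0.05:
        return "yellow"
    return "green"
```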
Tooling Options
You can build a health dashboard from open-source tools. SonarQube handles complexity and coverage. CodeClimate provides churn analysis. Dependabot or Renovate track dependency freshness. The challenge is stitching these together into a coherent view.
Glue takes a different approach by indexing your codebase and providing code health visibility alongside feature discovery and codebase Q&A. The advantage is that health metrics are contextualized within a broader understanding of what the code does, not just how it looks. When you see that a module is unhealthy, you can immediately ask Glue what that module does and who depends on it.
Making the Dashboard Actionable
A dashboard that nobody acts on is just a decoration. Build these habits:
- Monthly health reviews. Dedicate 30 minutes in your monthly engineering sync to review the dashboard. Identify the top three risk areas.
- Sprint allocation. Reserve 10-20% of sprint capacity for health-driven work. Use the dashboard to direct that investment.
- Roadmap integration. When a feature touches a high-risk module, flag it early in planning. Use the health data to justify higher estimates up front, so the buffer reads as evidence rather than padding.
The goal is not perfect code health. Every real-world codebase has debt. The goal is informed decision-making about where to invest maintenance effort, when to refactor, and how to communicate technical risk to the rest of the organization.
If you are an engineering manager looking for code health visibility that connects to the bigger picture of what your codebase does, see how Glue works and explore your own system's health profile.
Frequently Asked Questions
What are code health metrics?
Code health metrics are quantitative measurements of a codebase's internal quality, maintainability, and risk profile. Key metrics include cyclomatic complexity (how many paths through the code), code churn (how frequently files change), test coverage (what percentage of code is tested), and dependency freshness (how up-to-date your libraries are). Together, these metrics help engineering managers assess where technical risk concentrates and how to allocate maintenance effort.
How do you measure codebase health?
Codebase health is measured by combining multiple metrics rather than relying on a single score. Start with static analysis tools for complexity, coverage tracking from your CI pipeline, git history analysis for churn and knowledge concentration, and dependency scanning for freshness and vulnerabilities. The most useful approach maps these metrics against each other. A high-complexity, high-churn file with low coverage and a single owner represents a much higher risk than any individual metric would suggest. Platforms like Glue provide this cross-referenced view alongside broader codebase understanding.
What metrics should engineering managers track?
Engineering managers should prioritize metrics that connect code quality to business outcomes: cyclomatic complexity (predicts bug rates and change difficulty), code churn hotspots (identifies modules that consume disproportionate maintenance effort), test coverage on critical paths (measures regression risk), dependency freshness (signals security and compatibility exposure), and knowledge concentration (highlights people-dependent risk). According to Jellyfish, developers spend 23-42% of their time on technical debt, so the metrics that help you direct that investment efficiently will have the biggest impact on team velocity.