Glue

AI codebase intelligence for product teams. See your product without reading code.

Product

  • How It Works
  • Benefits
  • For PMs
  • For EMs
  • For CTOs

Resources

  • Blog
  • Guides
  • Glossary
  • Comparisons
  • Use Cases

Company

  • About
  • Authors
  • Support
© 2026 Glue. All rights reserved.
RSS
Glue
For PMsFor EMsFor CTOsHow It WorksBlogAbout
BLOG

I Analyzed 50 Codebases - Here Are the 5 Patterns That Predict Tech Debt

Tech debt doesn't appear suddenly. It leaves signatures.

SS
Sahil SinghFounder & CEO
July 1, 202611 min read

Tech debt doesn't appear suddenly. It leaves signatures.

Over the past eighteen months, I've analyzed over 50 codebases - everything from Django startups to polyglot fintech systems to legacy enterprise monoliths. I was looking for patterns: early warning signs that would predict which codebases would accumulate technical debt, slow down development velocity, and eventually become unmaintainable.

What I found is that tech debt doesn't sneak up on you. There are measurable patterns in your code, commit history, and development practices that reliably predict where tech debt will accumulate. If you know what to look for, you can catch these patterns early and prevent them from becoming crises.

Here are the five patterns that appeared across nearly every codebase that eventually became a tech debt problem.

Pattern 1: High-Churn Files With Many Authors (Knowledge Fragmentation)

Threshold: A file modified by more than 10 unique authors over the past 12 months, with an average of 20+ commits per author.

This is the most reliable early warning signal I found.

Here's what's happening: When many different developers touch the same file, they're implicitly saying "nobody really owns this." Without a single person (or small team) responsible for the file's design, multiple incompatible mental models get layered into the code. Each author adds features their way. Over time, the file becomes a patchwork of different styles, patterns, and assumptions.

In my analysis, files with this pattern showed:

  • 50% higher cyclomatic complexity than average
  • 3x slower code review cycles (reviews took longer because reviewers had to understand multiple competing design approaches)
  • 40% more bug fixes in the following quarter

Real example: In a Python codebase I analyzed, user_service.py had been touched by 14 different developers. It was 800 lines long, with four different approaches to user validation, inconsistent error handling, and three separate caching strategies. When I mapped the commit authors and their tenure, I found that each developer had added their own "solution" to the same problems because they didn't understand the previous solution.

Why this matters: High-churn, multi-author files are where organizational knowledge becomes fragmented. New developers can't find a coherent design to learn from. Code reviews become debates about "which approach is right" instead of catching actual bugs. The file grows in complexity with every change, and nobody fully understands it.

How to detect it:

git log --pretty=format:"%an" -- path/to/file.js | sort | uniq -c | wc -l

A count above 10 is a warning sign.

What to do about it:

  • Identify the primary owner and give them authority over the file's design.
  • Extract the file into smaller, single-purpose modules. High-churn files are usually doing too many things.
  • Document the current design decisions and alternative approaches that have been attempted.
  • Establish a code review requirement that the primary owner approves changes.
  • Schedule a refactoring sprint specifically for this file with the owner leading.

Pattern 2: Functions Longer Than 200 Lines That Keep Growing (Complexity Accumulation)

Threshold: Any function exceeding 200 lines that has grown more than 50 lines in the past 6 months.

Functions under 50 lines have lower bug density. Functions between 50-100 lines are manageable. Functions over 200 lines are a red flag. Functions over 200 lines that are still growing are an active tech debt accumulation point.

Here's why: When a function is long and complex, adding a small feature is easier than refactoring. You find the right spot in the 300-line function and add five lines. Now it's 305 lines. Three months later, someone adds another edge case. Now it's 320 lines. The function never gets broken down because the alternative (refactoring) always seems more risky than "just adding a feature."

I reviewed one Java codebase where the main request handler function was 680 lines and had been growing steadily: 450 lines 18 months ago, 550 lines 12 months ago, 680 lines now.

The damage:

  • Cyclomatic complexity of 47 (ideally below 10)
  • Coverage by exactly two unit tests (impossible to test all branches)
  • Average review time of 3 hours for any change touching this function
  • 7 bugs fixed in the last quarter specifically in this function

How to detect it:

// In your IDE, use tools like SonarQube, Pylint, or ESLint with complexity rules
// Or manually: select the function and count brackets/conditionals

Most linters will flag functions over 200 lines. The key is catching them while they're still growing.

What to do about it:

The solution is forced refactoring. You can't incrementally fix a 680-line function. You need to:

  1. Extract pure functions. Pull out any logic that doesn't depend on state into separate, testable functions.
  2. Use the Strategy pattern. If the function handles multiple cases, break them into separate strategy classes.
  3. Extract helper functions. Even if the helpers are only called once, extracting them makes the main function more readable and testable.
  4. Break into smaller classes. Sometimes a 680-line function is telling you that you need multiple classes.

Set a hard rule: maximum 100-150 lines per function. Enforce it in code review.


Pattern 3: Test Coverage Declining Quarter-Over-Quarter (Quality Erosion)

Threshold: Test coverage declining more than 5% in a single quarter.

Test coverage isn't a perfect metric - 100% coverage doesn't mean your tests are good, and 80% coverage doesn't mean they're bad. But the direction of coverage change is extremely meaningful.

Rising coverage suggests the team is being disciplined about testing new code. Declining coverage suggests the opposite: the team is shipping code they're not testing. Even if they're not intentionally doing this, it usually means they're under time pressure, and they're cutting the easiest corner: the tests.

In my analysis, teams with declining test coverage showed:

  • 3.2x higher bug escape rate to production
  • Slower onboarding of new developers (they couldn't trust the tests to tell them what code should do)
  • More frequent "emergency" rollbacks
  • Higher refactoring anxiety (teams didn't trust tests to catch regressions)

One team I tracked had test coverage of 82% in Q1. By Q3, it was 71%. In Q4, they had a critical production bug that took 14 hours to diagnose and fix. The root cause? A change to a utility function that had been tested in Q1 but test coverage had been allowed to erode.

Why this matters: Declining coverage is an early indicator that your team has lost confidence in your test suite, or that they're shipping without validating their changes. Either way, it's a leading indicator of bugs and rework.

How to detect it:

# Track coverage over time
./run_coverage.sh > coverage_$(date +%Y%m%d).txt
# Compare quarter-to-quarter

Most CI systems (GitHub Actions, CircleCI, GitLab CI) can track and report on coverage trends.

What to do about it:

  • Set coverage targets per directory, not per codebase. You might require 90% on core libraries and 70% on UI code.
  • Make coverage part of merge checks. If a PR lowers coverage, require justification.
  • Investigate why coverage is falling. Is the team under pressure? Do they not understand how to test a feature? Are the tests too brittle? The coverage number is telling you something is wrong; investigate what.
  • Focus on testing risky code. Don't force 100% coverage on trivial getters. Focus coverage on business logic, edge cases, and code that's frequently modified.

Pattern 4: Dependency Age More Than 2 Years Behind Latest (Update Debt)

Threshold: Critical dependencies more than 24 months behind the latest stable release, or more than 3 major versions out of date.

I found this pattern in about 40% of the codebases I analyzed. Teams would let dependencies drift, partly from neglect and partly from fear: "If we upgrade Rails, will our app break?"

The consequences:

  • Security vulnerabilities in old dependencies (this one is serious)
  • Performance degradation (new versions are often optimized)
  • Incompatibility with new team members' tools (you can't use newer linters, formatters, or type checkers with old dependencies)
  • Increased effort for major upgrades (jumping from React 15 to React 18 is harder than a series of incremental upgrades)

One codebase I reviewed was running Express 4.16 (released in 2018) while the latest stable was 4.19. It sounds minor, but the codebase couldn't use modern middleware, and developers trying to add new features had to work around Express limitations that had been solved years ago.

How to detect it:

# Node/npm
npm outdated

# Python
pip list --outdated

# Ruby
bundle outdated

# Rust
cargo outdated

What to do about it:

  • Establish a dependency update cadence. Once per month or quarter, systematically update dependencies. Smaller, frequent updates are easier than large, infrequent ones.
  • Automate dependency updates for patches. Use Dependabot or Renovate to automatically create PRs for patch and minor version updates.
  • Test dependency upgrades in CI. Have your test suite run against the new version before merging.
  • Prioritize security updates. If a critical dependency has a security patch, update immediately.
  • Set a policy. "No major version more than 2 years old." Make it a team standard, not optional.

Pattern 5: Commit Messages Mentioning "Workaround," "Hack," "TODO," or "FIXME" (Acknowledged Shortcuts)

Threshold: More than 3-5 commits per month with these keywords in the message or code comments.

This pattern is interesting because it's a signal of acknowledged debt. Developers know they're taking shortcuts, and they're leaving breadcrumbs: "We know this isn't right. We'll fix it later."

The problem: they rarely do.

I analyzed commit message history across 50 codebases and found that commits mentioning "hack," "workaround," or "temporary" were highly predictive of future bugs. Teams that had more acknowledged shortcuts ended up with:

  • More production incidents related to those "temporary" changes
  • Slower feature development (the workarounds tangled with new features)
  • Higher cognitive load for developers (they had to remember the shortcuts)

One example: A team wrote "HACK: Temporary fix for race condition" and intended to refactor it. Eighteen months later, that "temporary" fix was still there, was now called from five different places, and had actually become the API. When they finally tried to fix it properly, it required changes across five different services.

How to detect it:

# Search commit messages
git log --all --grep="hack\ | workaround\ | FIXME\ | TODO" --oneline | wc -l

# Or search the codebase
grep -r "FIXME\ | TODO\ | HACK\ | XXX" src/ | wc -l

What to do about it:

  • Catalog the shortcuts. Create a issues list of every acknowledged hack in your codebase.
  • Assign owners. Each hack needs an owner who committed to fixing it. Make them visible.
  • Set deadlines. "We'll refactor this race condition workaround in Q2." Give it a timeline, not indefinite deferral.
  • Don't allow new hacks without an issue. If a developer writes a workaround, require them to create a tracked issue for fixing it.
  • Actually fix them. Block at least 10% of sprint capacity for fixing the hacks you've acknowledged. Otherwise they're not actually commitments.

The Bigger Picture: These Patterns Are Connected

The most important insight from analyzing these 50 codebases is that these patterns don't exist in isolation. They're correlated:

  • High-churn files tend to grow functions, because without coherent ownership, changes accumulate.
  • Growing functions tend to have declining test coverage, because adding tests to a 300-line function is hard.
  • Codebases with declining test coverage tend to accumulate more hacks, because developers lose confidence in preventing regressions.
  • And codebases with many acknowledged hacks tend to neglect dependency updates, because teams become too focused on managing technical debt to proactively maintain.

This means you have a window to intervene. When you see Pattern 1 (high-churn files), you can prevent Patterns 2-5 by establishing clear ownership, breaking the file down, and investing in tests. When you catch Pattern 3 (declining coverage), you can prevent further accumulation by investigating what's causing the pressure.

How to Monitor These Patterns

The best approach is to measure these patterns continuously:

  1. Set up automated detection for each pattern using your CI/CD pipeline.
  2. Make the metrics visible to the team - not to shame anyone, but to surface problems early.
  3. Establish threshold alerts. When a file hits 10 authors, or a function grows over 200 lines, notify the team.
  4. Investigate the root cause, not just the metric. If test coverage drops, ask why before blaming the team.

Tools that analyze your actual codebase structure - measuring file churn, function complexity, test coverage trends, and dependency age - are far more useful than velocity metrics or story points for predicting your real technical health.


These patterns aren't mysterious. They're visible in your code right now. The teams that catch them early - by systematically analyzing code quality metrics and codebase structure - tend to stay fast. The teams that ignore them until they become crises spend the next year paying down debt. Which team do you want to be?

[ AUTHOR ]

SS
Sahil SinghFounder & CEO

SHARE

RELATED

Keep reading

blogJun 24, 202611 min

Technical Debt Is Not a Metaphor - Here's How to Put a Dollar Figure on It

Ward Cunningham introduced the technical debt metaphor in 1992, and it was useful. Talking about debt helped engineering teams communicate with business stakeholders about the cost of shortcuts. But metaphors have limits. The moment you want to make an actual decision about whether to refactor or ship the next feature, a metaphor breaks down. You need numbers.

SS
Sahil SinghFounder & CEO
blogJun 26, 202613 min

The Bus Factor Problem: What Happens When Your Best Engineer Leaves

Your lead backend engineer walks into your office on a Tuesday morning and tells you they're leaving. Two weeks notice. They found a new opportunity. They're excited about it.

SS
Sahil SinghFounder & CEO
blogJul 3, 20267 min

How to Convince Your CTO to Invest in Developer Experience

You know the problem. Your team loses two hours a day to slow builds. New engineers take three weeks to understand the codebase. Your CI/CD pipeline feels like it was built in 2015. And when developers finally ship code, half the bugs should have been caught earlier.

SS
Sahil SinghFounder & CEO

See your codebase without reading code.

Get Started — Free