By Arjun Mehta, Principal Engineer at Glue
Your team's most critical service was written primarily by one person. You probably know exactly who I'm talking about. Maybe it's the payments system, the auth layer, or that internal data pipeline everyone depends on but nobody else fully understands. If that person gives two weeks' notice tomorrow, what happens? That question is why software engineering teams need to take bus factor seriously, before the answer becomes a crisis.
I've watched this scenario play out more times than I'd like to admit. It never ends well, and it's always preventable.
What Is Bus Factor?
Bus factor (also called truck factor, lottery factor, or the more optimistic "lottery number") is the minimum number of team members who would need to leave before a project stalls. If only one person understands how a system works, that system's bus factor is one.
The name comes from a morbid thought experiment: what happens if that person gets hit by a bus? But the reality is usually less dramatic and more common. People take new jobs. They go on parental leave. They get promoted to a different team. They burn out and take extended time off.
A bus factor of one doesn't mean something bad will happen. It means that when something normal happens, like someone moving on, your team is in trouble.
Here's a less discussed aspect of bus factor: it's not binary. A system doesn't just have "one person who knows it" or "everyone knows it." Knowledge distribution exists on a spectrum. Maybe two people can maintain the service, but only one understands the deployment pipeline. Maybe three people can write features, but nobody except the original author understands the database migration strategy.
The bus factor for a component is only as high as its least-distributed critical knowledge area. How knowledge silos form in engineering teams is a closely related topic, and it's worth reading about alongside this piece.
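To make the "least-distributed area" rule concrete, here is a minimal Python sketch. The function name, the service, the people, and the knowledge areas are all hypothetical and exist only for illustration. It assumes you can list, for each critical area, who could work in it unaided.

```python
def component_bus_factor(knowers_by_area: dict[str, set[str]]) -> int:
    """Return the size of the smallest group holding any one critical area.

    Losing everyone in that smallest group leaves the area with nobody,
    so this is the component's bus factor under the rule described above.
    """
    if not knowers_by_area:
        raise ValueError("at least one knowledge area is required")
    return min(len(people) for people in knowers_by_area.values())


# Hypothetical payments service: three people can write features,
# but only one understands the deployment pipeline.
payments = {
    "feature work": {"ana", "ben", "chen"},
    "database migrations": {"ana", "ben"},
    "deployment pipeline": {"ana"},
}

bus_factor = component_bus_factor(payments)
# The area that sets the bus factor is the one to spread first.
bottleneck = min(payments, key=lambda area: len(payments[area]))
print(bus_factor, bottleneck)
```

The useful output is usually the bottleneck area rather than the number: it tells you which piece of knowledge to distribute first.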
Why It Matters More Than You Think
Let's talk numbers, because the business impact of low bus factor is concrete.
Research from Cortex estimates that onboarding a new engineer to full productivity on an unfamiliar system takes 3 to 6 months and costs approximately $240,000 when you factor in the new hire's ramp time, the productivity drain on senior engineers doing knowledge transfer, and the opportunity cost of delayed work.
That's the planned scenario, where you have time to hire and onboard. When someone leaves suddenly and they were the only person who understood a critical system, the cost is much higher. You're not onboarding. You're reverse-engineering.
Stripe's 2018 Developer Coefficient report found that engineers spend an average of about 17 hours per week on maintenance tasks, including understanding and fixing code they didn't write. When the person who wrote it is gone and there's no documentation or tooling to fill the gap, that number goes up dramatically.
And every time someone interrupts a remaining team member to ask "how does this work?", Gloria Mark's research tells us it takes 23 minutes to get back to deep focus. In a low bus factor situation, those interruptions multiply. The people who do have partial knowledge become the new bottleneck, and their own productivity craters.
Low bus factor also creates invisible risk at the organizational level. I've seen teams delay critical refactoring because the one person who understood the old system was too busy to supervise the transition. I've seen releases pushed back because the engineer who owned the deployment process was on vacation. These aren't edge cases. They're Tuesday.
How to Measure Your Bus Factor
Most teams have an intuitive sense of where their bus factor risks are. You already know which systems make you nervous. But intuition isn't enough for prioritization. You need a way to measure it.
Git-based analysis is the most straightforward starting point. Look at commit history for each component or service. If 80% or more of the commits come from a single author, that's a bus factor signal. If a file or module hasn't been touched by more than one person in six months, flag it.
Code review coverage is another indicator. If only one person ever reviews changes to a particular area of the codebase, they're likely the only one who deeply understands it. Even if two people write code for a service, the bus factor is low if only one of them does the review.
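Review concentration can be checked the same way. This sketch assumes you have already exported review records as (area, reviewer) pairs from whatever code review system you use. That export step, the area names, and the function name are all hypothetical.

```python
from collections import defaultdict


def single_reviewer_areas(reviews: list[tuple[str, str]]) -> dict[str, str]:
    """Map each area that has exactly one distinct reviewer to that reviewer."""
    reviewers_by_area: dict[str, set[str]] = defaultdict(set)
    for area, reviewer in reviews:
        reviewers_by_area[area].add(reviewer)
    return {
        area: next(iter(people))
        for area, people in reviewers_by_area.items()
        if len(people) == 1
    }


# Illustrative data: two people review billing, only one reviews auth.
reviews = [
    ("auth", "ana"),
    ("auth", "ana"),
    ("billing", "ben"),
    ("billing", "chen"),
]
flagged = single_reviewer_areas(reviews)
print(flagged)
```

Any area that shows up here is a candidate for the review rotation described later in this piece.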
Incident response patterns reveal a lot. When something breaks in a specific system, who gets paged? Who actually fixes it? If it's always the same person, that's your bus factor telling you something.
Onboarding friction is a lagging indicator but a powerful one. If new engineers consistently say "nobody can explain how X works" or "the docs for Y are outdated," you're looking at a bus factor problem that's already causing damage.
The challenge with all of these is that they require manual analysis. You're pulling data from git, cross-referencing it with team rosters, and trying to build a picture across dozens of services. This is where a code intelligence platform helps. Glue can automatically map knowledge distribution across your codebase, flagging components where ownership is concentrated and knowledge is siloed.
Strategies to Increase Bus Factor
Knowing your bus factor is step one. Fixing it is the harder, more important step. Here's what actually works.
Rotate code review assignments. This is the lowest-effort, highest-impact change you can make. If the same person always reviews changes to a service, rotate someone else in. Code review forces the reviewer to understand the code, ask questions, and build mental models. Over three to four review cycles, a second person develops real familiarity.
Pair on critical path work. When making significant changes to a low-bus-factor system, pair program it. Not as a formality, but as genuine collaboration where both engineers are engaged. The goal isn't to write code faster. The goal is to create a second person who understands what was done and why.
Write decision records, not just documentation. Traditional code documentation goes stale almost immediately. Architecture decision records (ADRs), which capture why a decision was made, not just what was decided, age much better. Six months from now, knowing that "we chose Postgres over DynamoDB because of X, Y, and Z" is more valuable than a class diagram that's three refactors behind.
Automate knowledge capture. This is where I'm biased, but it's also where I'm right. Manual knowledge transfer doesn't scale. People leave. Documentation rots. But tools that continuously analyze the codebase and surface knowledge automatically don't get tired, don't forget to update the wiki, and don't give two weeks' notice.
Glue's bus factor glossary entry breaks down the concept in more detail, but the platform itself is designed to solve this problem at its root. By making codebase knowledge queryable and continuously updated, Glue turns tribal knowledge into shared knowledge. New team members can ask "how does the auth service work?" and get an accurate, current answer. Engineers can explore unfamiliar parts of the codebase without interrupting the original author.
Create a knowledge distribution goal. Make bus factor a team metric. During sprint planning, look at the bus factor of the systems you're touching and intentionally assign work to increase knowledge spread. It's slower in the short term. It's dramatically safer in the long term.
Don't let heroes be heroes. In many teams, the single-point-of-failure engineer is also the most productive. They've accumulated so much context that they can ship faster than anyone else. That feels like an asset until they leave, and then it feels like a catastrophe. Resist the temptation to keep routing critical work to your fastest person. Distribute it deliberately, even when it feels inefficient.
The Real Cost of Doing Nothing
I want to close with a thought experiment. Pick your team's lowest-bus-factor system. The one where a single person holds most of the knowledge. Now imagine they resign tomorrow.
How long before someone else can make a meaningful change to that system? How many questions will go unanswered? How many sprints will be derailed while the remaining team reverse-engineers what should have been shared knowledge?
If those questions make you uncomfortable, you already know what needs to happen. Start measuring. Start distributing. And consider using a tool like Glue to capture and share the knowledge that currently lives in one person's head.
Your codebase is too important to exist in only one person's mind.
FAQ
What is bus factor in software?
Bus factor is the minimum number of people who would need to leave a team before a project or system can no longer be maintained. A bus factor of one means a single person holds all the critical knowledge for a component. If they leave, get sick, or change roles, the team has no one who fully understands that system. It's a measure of knowledge concentration risk, and most engineering teams have at least one component with a dangerously low bus factor.
How do you increase bus factor?
The most effective strategies are rotating code review assignments, pair programming on critical systems, writing architecture decision records, and intentionally distributing work across multiple engineers. You can also use code intelligence tools like Glue to automatically surface knowledge distribution patterns and make codebase knowledge accessible to the broader team. The key is treating bus factor as a team metric, not an afterthought, and building knowledge distribution into your planning process.
What happens when a key developer leaves?
When a key developer leaves a system with a low bus factor, the remaining team faces a significant knowledge gap. Research suggests onboarding a replacement to full productivity takes 3 to 6 months. In the interim, bug fixes take longer, feature development slows, and other engineers spend significant time reverse-engineering the departed engineer's work. The disruption compounds when the system is on the critical path, because every downstream team that depends on it also slows down. Prevention through knowledge distribution is far cheaper than recovery.