When to Rewrite vs Refactor

The rewrite question comes up in every long-lived codebase. The system is painful to work with, velocity has slowed, and someone suggests starting over. The decision is consequential: a wrong rewrite can set a team back by years, and a wrong refusal to rewrite can trap a team in a system that cannot evolve. Here is how I approach it.

The Default Is Refactor

Refactoring preserves working behavior while improving internal structure. It is incremental, reversible, and carries lower risk than a rewrite. My default position is to refactor unless there is a specific, compelling reason to rewrite.

The reasons this default exists:

Working software contains encoded knowledge. Every bug fix, every edge case handler, every "weird" conditional exists because someone encountered a real scenario. A rewrite loses this accumulated knowledge.
Rewrites take 2-3x longer than estimated. This is not a borrowed rule of thumb; it is a pattern I have observed consistently across projects, teams, and technology stacks.
The team must maintain the old system during the rewrite. Development effort is split between the old system (which still needs bug fixes and features) and the new system. Neither gets full attention.
Feature parity is a moving target. By the time the rewrite catches up to the old system's features, the old system has accumulated new features that the rewrite does not yet support.

Signals That a Rewrite Is Justified

Despite the default, there are situations where refactoring is not viable:

The technology platform is end-of-life. If the language, framework, or runtime is no longer maintained and has known security vulnerabilities that cannot be patched, a rewrite may be the only option. Refactoring within a dead ecosystem does not solve the underlying problem.

The architecture fundamentally cannot meet a hard requirement. Examples: a single-tenant system that must become multi-tenant; a monolith whose components must be independently deployable for regulatory reasons; a synchronous pipeline that must process events in real time. When the gap is architectural rather than a matter of implementation, refactoring may be insufficient.

The codebase cannot be tested. If the system has no test coverage, no separation of concerns, and the code is so entangled that adding tests requires rewriting the code anyway, the distinction between "refactor" and "rewrite" becomes semantic.

The team has zero domain experts for the current implementation. If everyone who understood the system has left and the current team cannot safely make changes, the system is already effectively unmaintained. A rewrite with the current team may be faster than reverse-engineering the existing system.

The Strangler Fig Pattern

When a rewrite is justified, I never do a big-bang replacement. The strangler fig pattern replaces the old system incrementally:

Identify a bounded piece of functionality in the old system.
Build the replacement for that piece using the new architecture.
Route traffic to the new implementation, keeping the old one as fallback.
Verify correctness and performance.
Remove the old implementation for that piece.
Repeat for the next piece.

This gives you the benefits of a rewrite (new architecture, clean code) with the risk profile of a refactor (incremental, reversible, always shippable).
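The routing step above ("route traffic to the new implementation, keeping the old one as fallback") can be sketched as a small dispatcher. This is a minimal illustration under stated assumptions, not a production router: `StranglerRouter`, `legacy_quote`, and `new_quote` are hypothetical names, and falling back on any exception is one possible policy among several.

```python
from typing import Callable, Dict, List


class StranglerRouter:
    """Sends each named operation to its new implementation once one is
    registered, and to the legacy implementation otherwise."""

    def __init__(self, legacy: Dict[str, Callable[[dict], float]]) -> None:
        self._legacy = dict(legacy)
        self._migrated: Dict[str, Callable[[dict], float]] = {}
        self.fallbacks: List[str] = []  # operations that fell back at runtime

    def migrate(self, name: str, handler: Callable[[dict], float]) -> None:
        """Start routing `name` to the new implementation."""
        self._migrated[name] = handler

    def retire_legacy(self, name: str) -> None:
        """Remove the old implementation once the new one is verified."""
        if name not in self._migrated:
            raise ValueError(f"{name!r} has no replacement yet")
        self._legacy.pop(name, None)

    def handle(self, name: str, request: dict) -> float:
        new = self._migrated.get(name)
        if new is None:
            return self._legacy[name](request)
        try:
            return new(request)
        except Exception:
            legacy = self._legacy.get(name)
            if legacy is None:
                raise  # legacy already retired: surface the failure
            self.fallbacks.append(name)  # record it so the bug is not hidden
            return legacy(request)


# Hypothetical old and new implementations of one bounded piece.
def legacy_quote(order: dict) -> float:
    return order["qty"] * 10.0  # hard-coded unit price in the old system


def new_quote(order: dict) -> float:
    return order["qty"] * order["unit_price"]


router = StranglerRouter({"quote": legacy_quote, "invoice": lambda order: 0.0})
router.migrate("quote", new_quote)
```

Operations that have not been migrated (`"invoice"` here) keep hitting the legacy code untouched, and the `fallbacks` list gives a crude signal for the "verify correctness" step before `retire_legacy` is called.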

The key constraint: you must be able to draw clean boundaries around pieces of functionality. If the old system is so entangled that you cannot isolate a piece without rewriting everything, the strangler fig pattern does not work. In that case, you may need to refactor enough to create boundaries before you can start the incremental rewrite.

Decision Framework

I use the following evaluation:

Factor	Favors refactor	Favors rewrite
Codebase health	Tests exist, modules are separable	No tests, everything is coupled
Team knowledge	Team understands the existing system	Nobody understands the existing system
Platform viability	Platform is actively maintained	Platform is deprecated or insecure
Business tolerance	Cannot pause feature delivery	Can invest a quarter in infrastructure
Architectural fit	Architecture can evolve to meet needs	Architecture fundamentally cannot meet needs
Risk tolerance	Low (system is revenue-critical)	Higher (non-critical or has fallback)

If the evaluation is mixed (some factors favor refactor, others favor rewrite), I default to refactor. The cost of an unnecessary refactor is wasted effort. The cost of a failed rewrite is a multi-month or multi-year setback.
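The "mixed means refactor" rule can be written down as a tiny function. This is a sketch of one possible reading, not a formal scoring model: the factor names mirror the table above, while `Lean`, `FACTORS`, and `recommend` are hypothetical, and requiring every factor to favor a rewrite is an assumption about where "mixed" ends.

```python
from enum import Enum
from typing import Dict


class Lean(Enum):
    REFACTOR = "refactor"
    REWRITE = "rewrite"


# One entry per row of the evaluation table.
FACTORS = (
    "codebase_health",
    "team_knowledge",
    "platform_viability",
    "business_tolerance",
    "architectural_fit",
    "risk_tolerance",
)


def recommend(assessment: Dict[str, Lean]) -> Lean:
    """Return REWRITE only when every factor favors it; any mix falls back
    to the default of REFACTOR."""
    missing = [factor for factor in FACTORS if factor not in assessment]
    if missing:
        raise ValueError(f"unassessed factors: {missing}")
    if all(assessment[factor] is Lean.REWRITE for factor in FACTORS):
        return Lean.REWRITE
    return Lean.REFACTOR
```

Forcing every factor to be assessed is deliberate: the function refuses to answer rather than silently treating an unexamined factor as a vote either way.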

Common Rewrite Mistakes

Reproducing the old system's mistakes. Without understanding why the old system was built the way it was, the new system often recapitulates the same decisions. I require the team to document the old system's design rationale before starting the rewrite.

Gold-plating the rewrite. "Since we are rewriting anyway, let's also add X, Y, and Z." Every additional feature increases the time to parity and the risk of never finishing. The rewrite's scope should be strictly limited to achieving feature parity with a better architecture.

Underestimating data migration. The old system's data model contains years of accumulated inconsistencies, edge cases, and schema variations. Migrating this data to a new model is often the hardest part of the rewrite and is frequently underestimated by a factor of 3 or more.
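One way to surface those inconsistencies early is a dry run that converts every legacy row and collects the rejects instead of stopping at the first failure. The sketch below assumes a hypothetical customer table; the `Customer` model, the column names, and the validation rules are illustrative only.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Customer:
    """Hypothetical target model in the new system."""
    customer_id: int
    email: str
    signup: date


def convert(row: dict) -> Customer:
    """Convert one legacy row; raises on data the new model rejects."""
    email = (row.get("email") or "").strip().lower()
    if "@" not in email:
        raise ValueError(f"invalid email: {row.get('email')!r}")
    return Customer(int(row["id"]), email, date.fromisoformat(row["signup"]))


def dry_run(rows: Iterable[dict]) -> Tuple[List[Customer], List[Tuple[dict, str]]]:
    """Convert every row without writing anything, keeping each reject with its reason."""
    migrated: List[Customer] = []
    rejected: List[Tuple[dict, str]] = []
    for row in rows:
        try:
            migrated.append(convert(row))
        except (KeyError, TypeError, ValueError) as exc:
            rejected.append((row, f"{type(exc).__name__}: {exc}"))
    return migrated, rejected


# Made-up rows standing in for a production-like sample.
sample_rows = [
    {"id": "1", "email": " Ann@Example.com ", "signup": "2019-03-01"},
    {"id": "2", "email": None, "signup": "2020-01-15"},
    {"id": "3", "email": "bob@example.com", "signup": "01/02/2021"},
]
migrated, rejected = dry_run(sample_rows)
```

The reject list is the useful output: its size and the variety of reasons in it are a more honest basis for the migration estimate than the schema diagram.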

Neglecting the transition period. During the transition, users interact with both old and new systems. Data must be synchronized. Bugs must be fixed in both systems. Support teams must understand both systems. The operational cost of the transition period is significant and must be planned for.

Refactoring Strategies That Avoid Rewrites

When the system is painful but does not justify a rewrite, these strategies provide relief:

Extract and replace modules. Identify the most painful module, define its interface, build a replacement behind the interface, and swap it in. This is a mini-rewrite within a refactor.
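As a rough sketch of that sequence, the example below defines an interface, keeps the legacy module behind it, and adds a replacement that callers cannot tell apart. The tax domain and every name here (`TaxCalculator`, `LegacyTaxCalculator`, `TableTaxCalculator`, `checkout_total`) are hypothetical.

```python
from typing import Protocol


class TaxCalculator(Protocol):
    """The interface callers depend on, extracted from the painful module."""

    def tax_cents(self, amount_cents: int, region: str) -> int: ...


class LegacyTaxCalculator:
    """Existing behavior, left as-is but now sitting behind the interface."""

    def tax_cents(self, amount_cents: int, region: str) -> int:
        if region == "EU":
            return amount_cents * 20 // 100
        elif region == "US":
            return amount_cents * 7 // 100
        return 0


class TableTaxCalculator:
    """Replacement with the same observable behavior, driven by a rate table."""

    _RATES_PERCENT = {"EU": 20, "US": 7}

    def tax_cents(self, amount_cents: int, region: str) -> int:
        return amount_cents * self._RATES_PERCENT.get(region, 0) // 100


def checkout_total(calc: TaxCalculator, amount_cents: int, region: str) -> int:
    """Caller code: knows only the interface, not which implementation it got."""
    return amount_cents + calc.tax_cents(amount_cents, region)
```

Because `checkout_total` depends only on the interface, swapping implementations is a one-line change at the place where the calculator is constructed, and it can be reverted just as cheaply.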

Add a testing layer. Before changing anything, add characterization tests that capture the current behavior. These tests allow refactoring with confidence that behavior is preserved.
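A characterization test pins what the code does today, including behavior that looks wrong. The sketch below uses the standard `unittest` module; `legacy_shipping_fee` is a made-up stand-in for untested legacy code, and the expected values were obtained by running it, not taken from a specification.

```python
import unittest


def legacy_shipping_fee(weight_kg: float, express: bool) -> float:
    """Stand-in for legacy code with an edge case nobody documented."""
    if weight_kg <= 0:
        return 0.0  # surprising: invalid weights ship for free
    fee = 5.0 + 1.5 * weight_kg
    if express:
        fee *= 2
    return round(fee, 2)


class ShippingFeeCharacterization(unittest.TestCase):
    # (arguments, output observed from the current implementation)
    CASES = [
        ((1.0, False), 6.5),
        ((1.0, True), 13.0),
        ((0.0, False), 0.0),
        ((-3.0, True), 0.0),  # pinned on purpose, even though it looks like a bug
    ]

    def test_pins_current_behavior(self) -> None:
        for args, expected in self.CASES:
            with self.subTest(args=args):
                self.assertEqual(legacy_shipping_fee(*args), expected)
```

Run it with `python -m unittest`. If a refactor changes any pinned value, the test fails, and the team then decides deliberately whether that behavior change is wanted.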

Introduce an anti-corruption layer. When the old system's internal model is problematic, add a translation layer at the boundary. New code interacts with a clean model. The anti-corruption layer translates between the clean model and the legacy model.
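A minimal sketch of such a translation layer follows, assuming a hypothetical legacy record with cryptic column names and string-encoded values. `Account`, `LegacyAccountTranslator`, and the `ACCT_NO`, `NM`, and `STAT_CD` fields are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    """Clean model used by new code."""
    account_id: int
    display_name: str
    active: bool


class LegacyAccountTranslator:
    """Anti-corruption layer: the only place that knows the legacy field names
    and encodings."""

    def to_clean(self, record: dict) -> Account:
        return Account(
            account_id=int(record["ACCT_NO"]),       # zero-padded string in legacy
            display_name=record["NM"].strip(),       # legacy pads names with spaces
            active=record["STAT_CD"] == "A",         # "A" = active, anything else = not
        )

    def to_legacy(self, account: Account) -> dict:
        return {
            "ACCT_NO": str(account.account_id).zfill(8),
            "NM": account.display_name,
            "STAT_CD": "A" if account.active else "I",
        }
```

New code only ever sees `Account`. When the legacy encoding changes, or the legacy system is eventually retired, the translator is the single place that needs to change.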

Pay down debt incrementally. Allocate a fixed percentage of each sprint (15-20%) to technical debt reduction. This is sustainable and avoids the feast-or-famine cycle of debt accumulation followed by emergency rewrites.

The Rewrite Checklist

If the decision is to rewrite, verify these conditions before starting:

The team has documented why the current system cannot be refactored
The scope is limited to feature parity (no new features in v1)
A strangler fig or incremental approach is planned (no big-bang cutover)
Data migration has been prototyped with production-like data
The team has capacity to maintain the old system during the transition
Success criteria and timeline are defined (with a kill date if targets are not met)
Stakeholders understand and accept the investment and the risk

The kill date is critical. A rewrite that is 18 months in with no end in sight should be evaluated for cancellation. Time and money already spent are sunk costs; on their own they are not a reason to continue.
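A kill date only works if the review is mechanical enough that nobody can argue around it. The sketch below is one hypothetical way to encode that check; `RewritePlan`, its fields, and measuring progress by migrated pieces are assumptions, not part of the checklist itself.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RewritePlan:
    kill_date: date
    pieces_required_by_kill_date: int  # agreed up front, before work starts


def review(plan: RewritePlan, today: date, pieces_migrated: int) -> str:
    """Return the outcome of a checkpoint review against the agreed targets."""
    if today < plan.kill_date:
        return "continue"
    if pieces_migrated >= plan.pieces_required_by_kill_date:
        return "continue"
    # Past the kill date and short of the target: effort already spent is
    # deliberately not an input to this decision.
    return "evaluate for cancellation"
```

Note what the function does not take as input: the amount of time or money invested so far.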

Key Takeaways

Default to refactoring. Rewrites take 2-3x longer than estimated and lose encoded knowledge from the existing system.
Rewrites are justified when the platform is end-of-life, the architecture fundamentally cannot meet requirements, the codebase cannot be tested, or nobody on the team understands the current implementation.
Use the strangler fig pattern for rewrites: incremental replacement, not big-bang cutover.
Common rewrite mistakes: reproducing old design decisions, gold-plating, underestimating data migration, and neglecting the transition period.
Allocate 15-20% of each sprint to technical debt reduction to avoid reaching the rewrite threshold.
Set a kill date for rewrites and evaluate honestly whether to continue.

Final Thoughts

The rewrite temptation is strongest when frustration is highest. That is exactly when the decision should be most carefully evaluated. The best engineering teams I have worked with treat the rewrite question with the same rigor as any other architectural decision: defined criteria, documented trade-offs, and a clear plan for managing risk.
