Designing With Future Engineers in Mind
How to build systems that new team members can understand, modify, and operate safely without requiring extensive tribal knowledge transfer.
The engineer who will maintain your system in two years is not in the room when you design it. They may not be at the company yet. They will not have the context you have today, the Slack threads you remember, or the incident retrospectives that shaped your decisions. Designing for them is one of the most impactful things a senior engineer can do.
Related: Designing Event Schemas That Survive Product Changes.
See also: Designing a Feature Flag and Remote Config System.
The Knowledge Bus Factor
Every system has implicit knowledge: why a particular timeout value was chosen, why the retry logic has that specific backoff curve, why the service talks to this database instead of that one. This knowledge lives in the heads of the engineers who built it.
When those engineers leave, the knowledge leaves with them. The system still runs, but the team operating it no longer understands why it was built this way. Changes become risky because nobody knows which assumptions are load-bearing.
The goal is not to eliminate implicit knowledge. That is impossible. The goal is to reduce the gap between what the code expresses and what you need to know to operate it safely.
Architecture Decision Records
Every significant decision gets an ADR. The format I use:
# ADR-{number}: {title}
## Status: Accepted | Deprecated | Superseded by ADR-{n}
## Context
What situation prompted this decision? What constraints existed?
## Decision
What did we decide and why?
## Consequences
What are the trade-offs? What becomes easier? What becomes harder?
## Alternatives Considered
What did we not choose and why?
The "Alternatives Considered" section is the most valuable part. It tells future engineers not just what you chose, but what you explicitly rejected and why. This prevents them from revisiting decisions that have already been analyzed.
I store ADRs in the repository alongside the code. They are versioned, reviewable, and discoverable. A wiki page about architecture gets stale. An ADR next to the code it describes stays relevant because it is visible during code changes.
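Because ADRs live in the repository, their structure can be checked like any other file. The following is a minimal sketch, not a standard tool: it assumes ADRs are Markdown files that use the section headings from the template above, and the directory argument and function names are hypothetical.

```python
import re
from pathlib import Path
from typing import Dict, List

# Section headings taken from the ADR template above.
REQUIRED_SECTIONS = (
    "Status",
    "Context",
    "Decision",
    "Consequences",
    "Alternatives Considered",
)

# Matches "## Heading" or "## Heading: trailing text" and captures the heading.
_SECTION_PATTERN = re.compile(r"^##\s+([^:]+?)\s*(?::.*)?$")


def missing_adr_sections(adr_text: str) -> List[str]:
    """Return the required sections that never appear as a '## ' heading."""
    found = set()
    for line in adr_text.splitlines():
        match = _SECTION_PATTERN.match(line)
        if match:
            found.add(match.group(1))
    return [section for section in REQUIRED_SECTIONS if section not in found]


def find_incomplete_adrs(adr_dir: str) -> Dict[str, List[str]]:
    """Map each ADR file name in adr_dir (hypothetical layout) to its missing sections."""
    incomplete = {}
    for path in sorted(Path(adr_dir).glob("*.md")):
        missing = missing_adr_sections(path.read_text(encoding="utf-8"))
        if missing:
            incomplete[path.name] = missing
    return incomplete
```

Run as part of a build, a check like this would flag an ADR that skips "Alternatives Considered" before the omission reaches the main branch.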
Self-Documenting Code Patterns
Comments explain "why," but code structure explains "how." These patterns reduce the need for external documentation:
Explicit naming. A function called calculateShippingCostWithTaxForDomesticOrders is long. It is also unambiguous. I prefer verbose names that require no context to understand over short names that require a comment.
Constants with context. Instead of TIMEOUT = 5000, use DATABASE_CONNECTION_TIMEOUT_MS = 5000. Better yet, add a comment explaining why 5000ms and not 3000ms or 10000ms.
Error types that describe the failure. InsufficientInventoryError tells you more than BusinessLogicError. The error type itself is documentation.
Structured configuration. Group related configuration values together with descriptive section names. Use comments in configuration files to explain non-obvious values and their valid ranges.
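The patterns above could look something like this in practice. This is an illustrative sketch only: the rationale in the timeout comment, the default port, and the names DatabaseConfig and reserve_inventory are hypothetical, chosen to show the style rather than to describe a real system.

```python
from dataclasses import dataclass

# Hypothetical rationale, shown to illustrate the comment style:
# 5000 ms is long enough to cover a cold connection-pool start and short
# enough that a stuck connection fails before the caller's own deadline.
DATABASE_CONNECTION_TIMEOUT_MS = 5000


class InsufficientInventoryError(Exception):
    """Raised when an order asks for more units than are in stock."""

    def __init__(self, sku: str, requested: int, available: int) -> None:
        self.sku = sku
        self.requested = requested
        self.available = available
        super().__init__(
            f"SKU {sku}: requested {requested}, only {available} available"
        )


@dataclass(frozen=True)
class DatabaseConfig:
    """Database settings grouped under one descriptive name."""

    host: str
    # Valid range: 1-65535. The default here is a placeholder.
    port: int = 5432
    # Must be positive; see the constant above for why this default was chosen.
    connection_timeout_ms: int = DATABASE_CONNECTION_TIMEOUT_MS

    def __post_init__(self) -> None:
        if not 1 <= self.port <= 65535:
            raise ValueError(f"port must be in 1-65535, got {self.port}")
        if self.connection_timeout_ms <= 0:
            raise ValueError("connection_timeout_ms must be positive")


def reserve_inventory(sku: str, requested: int, available: int) -> int:
    """Return the stock left after reserving, or raise a descriptive error."""
    if requested > available:
        raise InsufficientInventoryError(sku, requested, available)
    return available - requested
```

The error carries the SKU and both quantities as attributes, so whoever reads the log line or catches the exception does not need to reconstruct what went wrong.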
Onboarding-Driven Design
I evaluate system design through the lens of onboarding. If a new engineer joins the team, how long before they can:
- Understand the system's purpose and boundaries (target: 1 day)
- Make a minor change and deploy it safely (target: 1 week)
- Diagnose and fix a production issue (target: 2 weeks)
- Design and implement a new feature (target: 1 month)
If any of these takes significantly longer than its target, the system design is creating an unnecessary knowledge barrier.
Practices that shorten onboarding:
- A single entry point. One README that explains what the system does, how to run it locally, and where to find more detailed documentation.
- Consistent patterns. Every endpoint follows the same structure. Every service uses the same error handling approach. Consistency means learning one pattern teaches you all of them.
- Runbooks for common operations. Step-by-step instructions for deploying, rolling back, scaling, and debugging. Not because engineers cannot figure it out, but because figuring it out under pressure wastes time.
Guardrails Over Guidelines
Guidelines say "you should do X." Guardrails make it impossible (or at least difficult) to do not-X.
| Guideline | Guardrail |
|---|---|
| "Always include a correlation ID in logs" | Logging library adds correlation ID automatically |
| "Validate inputs before processing" | Framework rejects requests that fail schema validation |
| "Do not deploy on Fridays" | CI/CD pipeline blocks Friday deployments |
| "Always use parameterized queries" | ORM does not expose raw query interface |
| "Test coverage must exceed 80%" | Build fails below coverage threshold |
Guidelines require discipline. Guardrails require intention to circumvent. Future engineers will follow guardrails even when they do not understand the reasoning, which is exactly when guidelines fail.
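As one way to picture the first row of the table, here is a minimal sketch of a logging helper that attaches a correlation ID to every record, built on Python's standard logging and contextvars modules. The helper names start_request and get_logger and the log format are assumptions for illustration, not part of any particular framework.

```python
import contextvars
import logging
import uuid
from typing import Optional

# Holds the correlation ID for the current request or task.
_correlation_id = contextvars.ContextVar("correlation_id", default="unset")


def start_request(correlation_id: Optional[str] = None) -> str:
    """Set the correlation ID for the current context, generating one if needed."""
    current = correlation_id or uuid.uuid4().hex
    _correlation_id.set(current)
    return current


class CorrelationIdFilter(logging.Filter):
    """Copies the current correlation ID onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = _correlation_id.get()
        return True  # never drops a record, only annotates it


def get_logger(name: str) -> logging.Logger:
    """Return a logger whose output always includes the correlation ID."""
    logger = logging.getLogger(name)
    if not any(isinstance(f, CorrelationIdFilter) for f in logger.filters):
        logger.addFilter(CorrelationIdFilter())
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter(
                "%(asctime)s %(levelname)s [%(correlation_id)s] %(name)s: %(message)s"
            )
        )
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    # Avoid duplicate lines from root handlers that lack this format.
    logger.propagate = False
    return logger
```

If the team's convention is that every module obtains its logger through get_logger, an engineer who has never heard of correlation IDs still produces traceable logs.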
Reducing Tribal Knowledge Dependencies
Specific practices I follow to minimize tribal knowledge:
Annotate magic numbers and thresholds. Every hardcoded value in the codebase should have a comment explaining its origin. "This timeout was set to 30 seconds based on P99 latency measurements from 2024-Q3" is far more useful than a bare 30000.
Document failure modes and their mitigations. A comment above a circuit breaker configuration that says "Service X has a known failure mode where it returns 200 OK with empty bodies during high load. This circuit breaker treats empty bodies as failures" saves hours of debugging.
Record the rationale for non-obvious dependencies. If your service depends on a specific version of a library due to a bug in later versions, document the bug and the version constraint. Otherwise, someone will upgrade it.
Maintain a glossary of domain terms. If your system uses terms like "settlement," "reconciliation," or "hydration" in domain-specific ways, define them. Different engineers bring different definitions of common terms.
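The failure-mode annotation described above can sit directly next to the code that depends on it. Below is a minimal sketch of a circuit breaker that treats empty 200 responses as failures. The class names, thresholds, and the fetch callable are hypothetical, and a production breaker would need thread safety and metrics that this sketch omits.

```python
import time
from typing import Callable, Optional, Tuple


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the upstream call is skipped."""


class UpstreamFailureError(Exception):
    """Raised when the upstream response counts as a failure."""


class EmptyBodyCircuitBreaker:
    def __init__(
        self,
        fetch: Callable[[], Tuple[int, bytes]],
        failure_threshold: int = 3,          # hypothetical value
        reset_after_seconds: float = 30.0,   # hypothetical value
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._failure_threshold = failure_threshold
        self._reset_after_seconds = reset_after_seconds
        self._clock = clock
        self._consecutive_failures = 0
        self._opened_at: Optional[float] = None

    def call(self) -> bytes:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after_seconds:
                raise CircuitOpenError("upstream calls are paused")
            # Cool-down elapsed: allow a single trial call. The failure count
            # is kept, so one more failure re-opens the breaker immediately.
            self._opened_at = None

        status, body = self._fetch()

        # Service X has a known failure mode where it returns 200 OK with an
        # empty body during high load, so an empty body counts as a failure
        # even though the status code looks healthy.
        if status != 200 or not body:
            self._consecutive_failures += 1
            if self._consecutive_failures >= self._failure_threshold:
                self._opened_at = self._clock()
            raise UpstreamFailureError(f"status={status}, body_length={len(body)}")

        self._consecutive_failures = 0
        return body
```

Without the comment, the `not body` check looks like a mistake that a well-meaning engineer might remove. With it, the check reads as a deliberate response to observed behavior.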
Code Reviews as Knowledge Transfer
Code reviews are not just about catching bugs. They are the primary mechanism for distributing system knowledge across the team. I use reviews to:
- Ask "why" questions that force the author to document their reasoning
- Suggest adding comments for non-obvious decisions
- Ensure the change follows established patterns (or explicitly justifies deviating)
- Verify that new team members can understand the change without verbal explanation
A code review that results in "LGTM" without comments is a missed knowledge transfer opportunity.
Key Takeaways
- Implicit knowledge leaves when engineers leave. Architecture Decision Records preserve the reasoning behind significant choices.
- Self-documenting code uses explicit names, contextual constants, and descriptive error types to reduce reliance on external documentation.
- Evaluate system design through onboarding timelines. If a new engineer cannot make a safe change within a week, the design is creating unnecessary barriers.
- Guardrails are more reliable than guidelines because they work even when the engineer does not understand the reasoning.
- Code reviews are knowledge transfer opportunities. Use them to distribute understanding across the team.
- Annotate every magic number, non-obvious dependency, and domain-specific term.
Final Thoughts
The best compliment a system can receive is "I joined the team last month and I already feel productive." That does not happen by accident. It happens because someone designed the system, the tooling, and the documentation with the assumption that the people operating it would not have the same context as the people who built it. That assumption is always correct.