Building Systems That Can Be Explained Simply

Dhruval Dhameliya·May 18, 2025·7 min read

Why the ability to explain a system in simple terms is a design constraint, not a communication skill, and how to build systems that meet this standard.

If you cannot explain your system to a new engineer in 15 minutes, the system is too complex. This is not a communication problem. It is a design problem. The ability to explain a system simply is a direct measure of how well its architecture maps to the problem it solves.

Explainability as a Design Constraint

Most teams treat explanation as something that happens after the system is built. Write the code, then write the documentation, then explain it during onboarding. This gets the order wrong.

I treat explainability as a design constraint that shapes architectural decisions. Before committing to a design, I ask: "Can I draw this on a whiteboard in under five minutes?" If the answer is no, the design has unnecessary complexity or poorly chosen abstractions.

This does not mean every system should be trivial. It means the high-level architecture should be simple enough that a competent engineer can understand the data flow, the component responsibilities, and the failure boundaries from a brief explanation. The details are complex. The shape should be clear.

The Napkin Test

A system passes the napkin test if its architecture can be sketched on a napkin (or equivalent small surface) with these elements:

  • Major components (3-7 boxes, not 20)
  • Data flow between components (arrows with labels)
  • External dependencies (databases, APIs, third-party services)
  • The primary user interaction path

If the sketch requires more than 7 components, the system either has too many components or needs to be explained at a higher level of abstraction.
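The checklist above can be turned into a rough automated check. The following is a minimal sketch, not a standard tool: the `Sketch` structure and the `napkin_problems` function are hypothetical names invented for illustration, and the only rule taken from the text is the upper bound of 7 components.

```python
from dataclasses import dataclass, field

MAX_NAPKIN_COMPONENTS = 7  # upper bound taken from the napkin test above


@dataclass
class Sketch:
    """Hypothetical, minimal description of an architecture sketch."""
    components: list[str]
    flows: list[tuple[str, str, str]]  # (source, target, label)
    external: list[str] = field(default_factory=list)


def napkin_problems(sketch: Sketch) -> list[str]:
    """Return the reasons a sketch fails the napkin test (empty list means it passes)."""
    problems = []
    if len(sketch.components) > MAX_NAPKIN_COMPONENTS:
        problems.append(
            f"{len(sketch.components)} components, expected at most {MAX_NAPKIN_COMPONENTS}"
        )
    known = set(sketch.components) | set(sketch.external)
    for source, target, label in sketch.flows:
        # Every arrow must connect two boxes that are actually on the napkin.
        if source not in known or target not in known:
            problems.append(f"flow {source} -> {target} references an unknown box")
        # Every arrow must say what moves along it.
        if not label:
            problems.append(f"flow {source} -> {target} has no label")
    return problems
```

A sketch with four components, labeled arrows, and one database returns an empty list. A sketch with 30 services returns at least one problem, which is the signal to either merge components or move up a level of abstraction.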

Systems that fail the napkin test:

  • Microservice architectures with 30+ services where no one can enumerate all the service-to-service dependencies
  • Event-driven systems where the event flow has circular dependencies or multiple fan-out stages
  • Systems with both synchronous and asynchronous paths for the same operation, chosen based on conditions that are not obvious

Naming as Architecture

The names you choose for components, services, endpoints, and data entities are the primary interface through which people understand the system. Poor naming creates a constant translation burden.

Naming rules I follow:

Use domain language, not technical language. Call it "OrderService," not "TransactionProcessingUnit." Call the database table "invoices," not "document_store_v3."

Names should be unambiguous within the system. If the system has both a "user" (the person using the product) and an "account" (the billing entity), never use these terms interchangeably. Define them once and use them consistently.

Avoid generic names. "Manager," "handler," "processor," "helper," and "utils" tell you nothing about what the component does. They are placeholders for naming decisions that were not made.

Rename when understanding improves. If the team has started calling the "EventProcessor" the "notification router" in conversation, rename the component. The code should match how the team talks about the system.
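The rule against generic names is easy to check mechanically. Below is a minimal sketch of such a check, assuming component names are written in CamelCase, snake_case, or kebab-case. The function names are hypothetical, and the word list is only the five examples from the rule above, not an exhaustive list.

```python
import re

# The generic words called out in the naming rules above.
GENERIC_WORDS = {"manager", "handler", "processor", "helper", "utils"}


def split_words(name: str) -> list[str]:
    """Split a CamelCase, snake_case, or kebab-case identifier into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [part.lower() for part in parts]


def generic_words_in(name: str) -> list[str]:
    """Return the generic placeholder words found in a component name."""
    return [word for word in split_words(name) if word in GENERIC_WORDS]


if __name__ == "__main__":
    for candidate in ["OrderService", "EventProcessor", "order_utils"]:
        print(candidate, generic_words_in(candidate))
```

A check like this cannot suggest a better name. It only flags the places where a naming decision was deferred.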

Layered Explanation

Complex systems need layered explanations, not simplified ones. Simplification loses information. Layering preserves it while controlling how much detail is presented at each level.

Layer 1: Purpose. What does this system do? One sentence. "It processes customer orders from placement through fulfillment."

Layer 2: Architecture. What are the major components and how do they interact? The napkin sketch. "Orders come in through the API, get validated, sent to a payment processor, and then queued for fulfillment."

Layer 3: Mechanics. How does each component work internally? This is where technical details live. "The payment processor uses a saga pattern with compensating transactions for partial failures."

Layer 4: Edge cases. What are the unusual behaviors? "If the payment processor times out, we hold the order in a pending state and retry three times over 15 minutes before canceling."

Each layer should make sense on its own. A new engineer should be productive with Layer 1 and Layer 2 knowledge. Layer 3 and Layer 4 come with experience.
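To show how a Layer 4 statement maps to code, here is a minimal sketch of the timeout rule from the example above. Everything beyond "retry three times over 15 minutes before canceling" is an assumption: the `charge` callable, the `OrderState` names, the evenly spaced retries, and the reading that three retries follow one initial attempt.

```python
from enum import Enum


class OrderState(Enum):
    PENDING = "pending"
    PAID = "paid"
    CANCELED = "canceled"


MAX_RETRIES = 3
RETRY_WINDOW_MINUTES = 15
# Assumption: retries are evenly spaced. The text gives only the count and the window.
RETRY_INTERVAL_MINUTES = RETRY_WINDOW_MINUTES / MAX_RETRIES


def settle_payment(charge) -> tuple[OrderState, int]:
    """Try a charge once, then retry on timeout up to MAX_RETRIES times.

    `charge` is a hypothetical callable that raises TimeoutError when the
    payment processor does not answer. Returns the final state and the
    number of attempts made.
    """
    attempts = 0
    state = OrderState.PENDING  # the order is held here while the charge is unresolved
    while attempts < 1 + MAX_RETRIES:
        attempts += 1
        try:
            charge()
            state = OrderState.PAID
            break
        except TimeoutError:
            # Still pending. A real system would schedule the next attempt
            # RETRY_INTERVAL_MINUTES later instead of looping immediately.
            continue
    else:
        # All attempts timed out: give up and cancel the order.
        state = OrderState.CANCELED
    return state, attempts
```

The point of the layering is that a new engineer does not need to read this function to understand the system. They need it only when they are debugging a stuck order.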

Design Patterns That Aid Explanation

Certain architectural patterns are inherently easier to explain than others.

Pipeline (linear data flow). Data enters at one end, passes through a series of stages, and exits at the other end. Each stage has a clear input and output. Easy to explain, easy to debug, easy to extend.
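A pipeline is simple enough to fit in a few lines. The sketch below is one possible shape, assuming each stage is a plain function from a dict to a dict. The stage names and the order fields are hypothetical and chosen only to echo the order-processing example used earlier.

```python
from functools import reduce
from typing import Callable

Stage = Callable[[dict], dict]


def pipeline(*stages: Stage) -> Stage:
    """Compose stages so the output of each one becomes the input of the next."""
    def run(item: dict) -> dict:
        return reduce(lambda current, stage: stage(current), stages, item)
    return run


def validate(order: dict) -> dict:
    if order["quantity"] <= 0:
        raise ValueError("quantity must be positive")
    return {**order, "valid": True}


def price(order: dict) -> dict:
    return {**order, "total": order["quantity"] * order["unit_price"]}


def queue_for_fulfillment(order: dict) -> dict:
    return {**order, "status": "queued"}


# The whole data flow reads top to bottom in one line.
process_order = pipeline(validate, price, queue_for_fulfillment)
```

Extending the system means adding one function and one name to the `pipeline(...)` call, which is why this shape is easy to explain and easy to debug.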

Request-response. A client sends a request, the server processes it, the server returns a response. The entire interaction is contained in a single exchange. The simplest pattern to reason about.

Publish-subscribe. Events are published to a topic. Subscribers receive events they have registered interest in. The publisher does not know about the subscribers. This is harder to explain than request-response but easier than arbitrary event routing.
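The defining property, that the publisher does not know about the subscribers, is visible in a minimal in-process sketch. The `Broker` class and the topic names are hypothetical. A production system would use a message broker rather than an in-memory dict.

```python
from collections import defaultdict
from typing import Any, Callable


class Broker:
    """In-memory publish-subscribe: topics map to lists of subscriber callbacks."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Any], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, event: Any) -> int:
        """Deliver the event to every subscriber of the topic; return how many received it."""
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(event)
        return len(callbacks)


if __name__ == "__main__":
    broker = Broker()
    broker.subscribe("order.placed", lambda event: print("send email for", event))
    broker.subscribe("order.placed", lambda event: print("write audit log for", event))
    # The publisher names only the topic, never the subscribers.
    broker.publish("order.placed", {"order_id": 1})
```

The explanation cost shows up in the gap between `publish` and its effects: nothing at the call site says who reacts.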

Patterns that resist explanation:

Event sourcing with CQRS. Two separate models (read and write), an event store, projections, eventual consistency. Each concept is individually simple, but the combination creates a system that takes significant time to explain.

Choreographed sagas. Multiple services react to events without a central coordinator. Understanding the overall flow requires tracing events across services. No single place shows the complete picture.

These patterns are sometimes necessary. But when they are used, the cost of explanation should be acknowledged and mitigated with documentation, diagrams, and tooling that visualizes the event flow.
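As one example of such tooling, the sketch below rebuilds the overall event flow from per-service declarations and checks it for cycles. It assumes each service can declare which events it consumes and emits. The declaration format, function names, and service names are hypothetical.

```python
def event_flow_edges(services: dict[str, dict[str, list[str]]]) -> list[tuple[str, str, str]]:
    """Return sorted (producer, event, consumer) triples derived from service declarations."""
    edges = []
    for producer, spec in services.items():
        for event in spec["emits"]:
            for consumer, other in services.items():
                if event in other["consumes"]:
                    edges.append((producer, event, consumer))
    return sorted(edges)


def has_cycle(edges: list[tuple[str, str, str]]) -> bool:
    """Depth-first search for a cycle in the producer -> consumer graph."""
    graph: dict[str, set[str]] = {}
    for producer, _event, consumer in edges:
        graph.setdefault(producer, set()).add(consumer)
    visiting: set[str] = set()
    done: set[str] = set()

    def visit(node: str) -> bool:
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node that is still on the current path
        visiting.add(node)
        found = any(visit(neighbor) for neighbor in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(graph))


if __name__ == "__main__":
    declared = {
        "orders": {"consumes": [], "emits": ["OrderPlaced"]},
        "payments": {"consumes": ["OrderPlaced"], "emits": ["PaymentCaptured"]},
        "fulfillment": {"consumes": ["PaymentCaptured"], "emits": []},
    }
    for producer, event, consumer in event_flow_edges(declared):
        print(f"{producer} --{event}--> {consumer}")
```

The printed edge list is the "single place that shows the complete picture" that choreography otherwise lacks, and the cycle check catches the circular event flows that fail the napkin test.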

The Documentation Minimum

Every system needs, at minimum:

  1. A one-paragraph description of what it does and why it exists.
  2. An architecture diagram showing components, data flow, and external dependencies. Updated within one sprint of any architectural change.
  3. A glossary defining domain terms used in the codebase.
  4. A "how to run it locally" guide that works. Test it quarterly by having someone follow the steps.
  5. An operations guide covering deployment, rollback, scaling, and common failure scenarios.

This is a baseline requirement, not a nice-to-have. Systems without these artifacts create an ongoing tax on every engineer who interacts with them.
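The presence of these five artifacts can be checked automatically, for example in CI. The sketch below assumes a file layout that is purely illustrative: the paths in `REQUIRED_DOCS` are hypothetical and should be replaced with wherever a team actually keeps each document. It checks only that the files exist, not that they are current or correct.

```python
from pathlib import Path

# Hypothetical locations for the five minimum artifacts listed above.
REQUIRED_DOCS = {
    "description": "README.md",
    "architecture diagram": "docs/architecture.md",
    "glossary": "docs/glossary.md",
    "local setup guide": "docs/running-locally.md",
    "operations guide": "docs/operations.md",
}


def missing_docs(repo_root: str) -> list[str]:
    """Return the labels of required documents that are absent under repo_root."""
    root = Path(repo_root)
    return [
        label
        for label, relative_path in REQUIRED_DOCS.items()
        if not (root / relative_path).is_file()
    ]


if __name__ == "__main__":
    for label in missing_docs("."):
        print("missing:", label)
```

Existence is the cheap half of the requirement. The quarterly walkthrough of the local setup guide still needs a person.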

Simplicity Through Constraints

Counter-intuitively, constraints often make systems easier to explain:

  • "All inter-service communication uses HTTP REST" is easier to explain than "some services use REST, some use gRPC, and some use a custom protocol."
  • "Every service uses PostgreSQL" is easier to explain than "some services use PostgreSQL, some use MongoDB, and one uses DynamoDB."
  • "Events are processed in order within a partition" is easier to explain than "events are processed in order except when they are not, depending on the consumer group configuration."

Standardization reduces the number of concepts someone needs to learn. Each additional technology, protocol, or pattern adds to the explanation burden.
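The explanation burden can be made concrete by counting distinct technologies across a service inventory. This is a minimal sketch under the assumption that each service declares one protocol and one datastore. The inventory format, the function name, and the service names are hypothetical.

```python
def concepts_to_learn(services: dict[str, dict[str, str]]) -> set[tuple[str, str]]:
    """Return the distinct (kind, technology) pairs a newcomer must understand."""
    concepts = set()
    for spec in services.values():
        concepts.add(("protocol", spec["protocol"]))
        concepts.add(("datastore", spec["datastore"]))
    return concepts


if __name__ == "__main__":
    standardized = {
        "orders": {"protocol": "REST", "datastore": "PostgreSQL"},
        "payments": {"protocol": "REST", "datastore": "PostgreSQL"},
        "fulfillment": {"protocol": "REST", "datastore": "PostgreSQL"},
    }
    mixed = {
        "orders": {"protocol": "REST", "datastore": "PostgreSQL"},
        "payments": {"protocol": "gRPC", "datastore": "MongoDB"},
        "fulfillment": {"protocol": "custom", "datastore": "DynamoDB"},
    }
    print(len(concepts_to_learn(standardized)), "concepts when standardized")
    print(len(concepts_to_learn(mixed)), "concepts when mixed")
```

In the standardized inventory the count stays flat as services are added. In the mixed one it can grow with every new service.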

Key Takeaways

  • Explainability is a design constraint, not a documentation task. If the architecture cannot be sketched on a napkin, it is too complex.
  • Use domain language for naming. Avoid generic names like "manager," "handler," and "processor."
  • Layer explanations: purpose, architecture, mechanics, edge cases. Each layer should stand on its own.
  • Linear data flow patterns (pipelines, request-response) are inherently easier to explain than event-driven choreography.
  • Standardization on technology and patterns reduces the number of concepts that need explanation.
  • Every system needs at minimum: a one-paragraph description, an architecture diagram, a glossary, local setup instructions, and an operations guide.

Final Thoughts

The ability to explain a system simply is not about dumbing it down. It is about an architecture that genuinely maps to the problem domain. When a system resists simple explanation, it is usually because the architecture has accumulated incidental complexity that does not serve the problem. Removing that complexity makes the system both easier to explain and better.
