Designing a Simple CMS From Scratch

Dhruval Dhameliya·October 18, 2025·6 min read

Architecture decisions behind building a file-based CMS with MDX, Git-backed versioning, and incremental builds for a content-heavy site.

Context

I needed a CMS for a technical blog. Requirements: MDX support, Git-based version control, fast builds, structured frontmatter validation, and zero runtime dependencies. No database. No admin panel. Just files, a schema, and a build pipeline.

Problem

Existing headless CMS options (Contentful, Sanity, Strapi) add network dependencies, API rate limits, and vendor lock-in. WordPress is operationally expensive for a static site. A file-based approach eliminates these issues but requires building content ingestion, validation, and rendering from scratch.

See also: Building a Simple Search Index.

Constraints

  • Content format: MDX (Markdown with JSX components)
  • Storage: Git repository (GitHub)
  • Build tool: Next.js with static export
  • Frontmatter schema: must be validated at build time, not runtime
  • Build time budget: under 60 seconds for 500 posts
  • Authors: 1-3 contributors, all comfortable with Git
  • No runtime API calls for content

Design

Content Structure

src/content/
  blog/
    post-slug.mdx
  projects/
    project-slug.mdx
  config/
    navigation.json
    site.json

Each MDX file contains frontmatter and content:

---
title: "Post Title"
description: "A short description"
date: "2025-10-18"
tags: ["architecture"]
draft: false
---
 
Content here with <CustomComponent /> support.

Schema Validation

Zod schemas validate frontmatter at build time:

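import { z } from "zod";

// Frontmatter contract for blog posts; any violation fails the build.
const blogSchema = z.object({
  title: z.string().min(1).max(120),
  description: z.string().min(1).max(300),
  // Checks the YYYY-MM-DD shape only, not that the date exists on the calendar.
  date: z.string().regex(/^\d{4}-\d{2}-\d{2}$/),
  tags: z.array(z.string()).min(1).max(5),
  draft: z.boolean().default(false), // defaults to false when omitted
});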

A build script runs validation before the Next.js build. Any schema violation fails the build with a clear error message pointing to the offending file and field.
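
To illustrate what an error "pointing to the offending file and field" could look like, here is a dependency-free sketch. It hand-rolls the same rules as the schema instead of calling Zod, so it shows the reporting shape rather than the actual build script. The function name `validateFrontmatter` and the message format are assumptions made for this example.

```typescript
// Hypothetical stand-in for the Zod check: mirrors the rules of blogSchema
// and reports each violation as "file: field: problem".
type Frontmatter = Record<string, unknown>;

function validateFrontmatter(file: string, data: Frontmatter): string[] {
  const errors: string[] = [];
  const fail = (field: string, problem: string): void => {
    errors.push(`${file}: ${field}: ${problem}`);
  };

  const checkString = (field: string, max: number): void => {
    const value = data[field];
    if (typeof value !== "string" || value.length < 1) {
      fail(field, "must be a non-empty string");
    } else if (value.length > max) {
      fail(field, `must be at most ${max} characters`);
    }
  };

  checkString("title", 120);
  checkString("description", 300);

  const date = data.date;
  if (typeof date !== "string" || !/^\d{4}-\d{2}-\d{2}$/.test(date)) {
    fail("date", "must match YYYY-MM-DD");
  }

  const tags = data.tags;
  if (!Array.isArray(tags) || !tags.every((t) => typeof t === "string")) {
    fail("tags", "must be an array of strings");
  } else if (tags.length < 1 || tags.length > 5) {
    fail("tags", "must contain between 1 and 5 entries");
  }

  // draft is optional (the schema defaults it to false) but must be a boolean.
  if (data.draft !== undefined && typeof data.draft !== "boolean") {
    fail("draft", "must be a boolean");
  }
  return errors;
}
```

A build script along these lines would collect the messages for every file, print them, and exit with a non-zero status if any were found.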

Content Pipeline

| Stage | Tool | Output |
|---|---|---|
| Parse frontmatter | gray-matter | Structured metadata |
| Validate schema | Zod | Pass/fail with errors |
| Compile MDX | next-mdx-remote | Serialized React tree |
| Generate slugs | File path convention | URL structure |
| Build index | Custom script | JSON feed for search, RSS |
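
The "Generate slugs" stage can be sketched as a pure function from file path to URL. The exact convention is not spelled out in this post, so the mapping below (`src/content/<collection>/<slug>.mdx` to `/<collection>/<slug>`) and the name `slugFromPath` are assumptions for illustration.

```typescript
// Assumed convention: src/content/<collection>/<slug>.mdx -> /<collection>/<slug>
// The content root and URL shape are illustrative, not taken from the real pipeline.
const CONTENT_ROOT = "src/content/";
const EXTENSION = ".mdx";

function slugFromPath(filePath: string): string {
  // Normalize Windows separators so the same convention works on every OS.
  const normalized = filePath.replace(/\\/g, "/");
  if (!normalized.startsWith(CONTENT_ROOT) || !normalized.endsWith(EXTENSION)) {
    throw new Error(`Not a content file: ${filePath}`);
  }
  // Drop the root prefix and the file extension; what remains is the URL path.
  const relative = normalized.slice(CONTENT_ROOT.length, -EXTENSION.length);
  return "/" + relative;
}
```

Because the URL is derived entirely from the path, renaming a file changes its URL, which is worth keeping in mind when reorganizing content.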

Component Resolution

Custom MDX components are registered in a central map:

const components = {
  Callout: CalloutComponent,
  CodeBlock: CodeBlockComponent,
  Table: TableComponent,
  Image: OptimizedImage,
};

This keeps content portable. If a component is removed, the build fails with a clear reference error rather than silently rendering broken HTML.
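
One possible way to surface a missing component before rendering is to scan the MDX source for capitalized JSX tags and compare them against the registry keys. This is a rough sketch, not necessarily how the build described here detects the problem: the regex is an approximation (it would also match tags inside code samples), and `findUnknownComponents` is a hypothetical helper.

```typescript
// Approximate check: find capitalized JSX tags in MDX source and report any
// that are missing from the component registry. A real implementation would
// walk the MDX syntax tree instead of using a regex.
function findUnknownComponents(mdxSource: string, registered: string[]): string[] {
  const known = new Set(registered);
  const unknown = new Set<string>();
  // Matches opening or self-closing tags whose name starts with an uppercase letter.
  const tagPattern = /<([A-Z][A-Za-z0-9]*)\b/g;
  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(mdxSource)) !== null) {
    const name = match[1];
    if (!known.has(name)) unknown.add(name);
  }
  return Array.from(unknown).sort();
}
```

Called with `Object.keys(components)` as the registry, a non-empty result identifies exactly which component names a post depends on that no longer exist.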

Trade-offs

| Aspect | File-Based CMS | Headless CMS (API) | Database CMS |
|---|---|---|---|
| Content editing | Code editor + Git | Web UI | Web UI |
| Version control | Git (native) | Vendor-specific | Manual/plugins |
| Build dependency | None (local files) | Network + API | Network + DB |
| Schema enforcement | Build-time | Runtime/webhook | Runtime |
| Preview | Local dev server | Preview API | Admin panel |
| Contributor friction | High (Git knowledge) | Low | Low |
| Vendor lock-in | None | High | Medium |
| Cost at scale | $0 | $50-500/month | $20-100/month |

The primary trade-off is contributor friction. Non-technical writers struggle with Git workflows. For a team of engineers writing technical content, this is not a constraint.

Build Performance

| Post Count | Full Build | Incremental (1 post changed) |
|---|---|---|
| 100 | 12s | 4s |
| 500 | 48s | 5s |
| 1,000 | 95s | 6s |
| 5,000 | 380s | 8s |

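Incremental builds rebuild only changed content, driven by file-system watching and a hash check: the build script hashes each file, compares the hash against a cached manifest, and recompiles only the files whose hash differs. Next.js ISR does not apply here, because it needs a running server and is not supported with static export.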
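
The hash-and-compare step can be sketched as a pure function over two manifests. This is a minimal illustration under assumptions: the manifest shape (path to content hash), the `BuildPlan` type, and the name `planIncrementalBuild` are invented for the example, and the hashes themselves would be computed elsewhere (for instance with SHA-256).

```typescript
// Maps a content file path to a hash of its contents.
type Manifest = Record<string, string>;

interface BuildPlan {
  changed: string[]; // new or modified files that need recompiling
  removed: string[]; // files in the cache that no longer exist on disk
}

// Compare freshly computed hashes against the cached manifest.
function planIncrementalBuild(current: Manifest, cached: Manifest): BuildPlan {
  const changed = Object.keys(current)
    .filter((file) => cached[file] !== current[file])
    .sort();
  const removed = Object.keys(cached)
    .filter((file) => !(file in current))
    .sort();
  return { changed, removed };
}
```

After a successful build, the current manifest would be written back to the cache so the next run compares against it.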

Related: Failure Modes I Actively Design For.

Failure Modes

Broken MDX syntax: An unclosed JSX tag in an MDX file crashes the entire build. Mitigation: a pre-commit hook that compiles each changed MDX file in isolation. Failures block the commit.
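
Compiling each changed file in isolation might look like the sketch below. The compiler is passed in as a function so the example stays dependency-free; in practice it could wrap an MDX compiler. `checkFilesInIsolation` and `CompileFailure` are hypothetical names, and how the list of changed files is obtained is left out.

```typescript
interface CompileFailure {
  file: string;
  message: string;
}

// `compile` is injected and is expected to throw on invalid MDX.
function checkFilesInIsolation(
  files: Record<string, string>, // path -> MDX source of each changed file
  compile: (source: string) => unknown,
): CompileFailure[] {
  const failures: CompileFailure[] = [];
  for (const file of Object.keys(files).sort()) {
    try {
      compile(files[file]);
    } catch (error) {
      // One bad file must not hide problems in the others, so keep going.
      const message = error instanceof Error ? error.message : String(error);
      failures.push({ file, message });
    }
  }
  return failures;
}
```

A hook built on this would exit with a non-zero status when the returned list is non-empty, which is what blocks the commit.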

Frontmatter drift: Over time, authors add non-schema fields or use incorrect types. The Zod validation catches this, but only at build time. If CI is slow, authors get feedback minutes after pushing. Mitigation: run validation as a pre-commit hook (sub-second for changed files).

Image reference rot: MDX files reference images by path. If an image is moved or deleted, the build does not fail (images are resolved at runtime by the browser). Mitigation: a build script that checks all image references against the filesystem.
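
A reference check of this kind could be sketched as follows. It only covers Markdown image syntax and `src="..."` attributes, and it takes the list of existing files as input rather than walking the filesystem itself; both simplifications, and the helper names, are assumptions for the example.

```typescript
// Collect image paths referenced in MDX source. Covers Markdown image syntax
// and src="..." attributes; other forms (imports, expressions) are ignored.
function extractImagePaths(mdxSource: string): string[] {
  const patterns = [
    /!\[[^\]]*\]\(([^)\s]+)[^)]*\)/g, // ![alt](path "optional title")
    /\bsrc=["']([^"']+)["']/g, // <Image src="path" />
  ];
  const found: string[] = [];
  for (const pattern of patterns) {
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(mdxSource)) !== null) {
      found.push(match[1]);
    }
  }
  return found;
}

// Return local references that are not among the files known to exist.
// Remote URLs are skipped because they cannot be checked against the filesystem.
function findMissingImages(mdxSource: string, existingFiles: string[]): string[] {
  return extractImagePaths(mdxSource)
    .filter((path) => !/^https?:\/\//.test(path))
    .filter((path) => existingFiles.indexOf(path) === -1);
}
```

Failing the build when `findMissingImages` returns anything turns a silent broken image into the same kind of hard error as a schema violation.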

Git merge conflicts in MDX: Two authors editing the same file create merge conflicts in content. MDX conflicts are harder to resolve than code conflicts because the diff context is prose. Mitigation: one file per post, small atomic commits, and a convention that each author works on separate posts.

Scaling Considerations

  • Full builds already exceed the 60-second budget at 1,000 posts (95s) and take 380s at 5,000. The solution is to keep full builds out of the authoring loop and reserve them for CI verification. Incremental builds handled single-post changes in under 10 seconds at every corpus size measured.
  • For multi-author workflows, consider a Git-based review process (pull requests for content) with automated preview deployments per PR.
  • Content search requires a pre-built index. At 5,000 posts, the search index is ~2MB. This should be server-side (see search post for details).
  • RSS and sitemap generation should be part of the build pipeline, not a runtime concern.

Observability

  • Build time per stage (parse, validate, compile, generate) logged to stdout and captured in CI
  • Schema validation errors surfaced as GitHub Actions annotations on the PR
  • Content statistics (word count, reading time, tag distribution) generated at build time and written to a JSON manifest
  • Broken link detection as a post-build step using a crawler against the static output
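
Surfacing validation errors as annotations can be done by printing GitHub Actions workflow commands to stdout. The sketch below formats one `::error` command per violation; the title text and the function name `errorAnnotation` are assumptions, and the escaping follows the rules for workflow command data and properties.

```typescript
// Escape the message part of a workflow command.
function escapeData(value: string): string {
  return value.replace(/%/g, "%25").replace(/\r/g, "%0D").replace(/\n/g, "%0A");
}

// Property values additionally escape ":" and "," because those delimit properties.
function escapeProperty(value: string): string {
  return escapeData(value).replace(/:/g, "%3A").replace(/,/g, "%2C");
}

// Format a frontmatter violation as an "error" workflow command. Printing the
// returned line to stdout during a workflow run is what creates the annotation.
function errorAnnotation(file: string, field: string, message: string): string {
  const title = escapeProperty(`Invalid frontmatter: ${field}`);
  return `::error file=${escapeProperty(file)},title=${title}::${escapeData(message)}`;
}
```

Outside of GitHub Actions the same lines are harmless plain text, so the validation script can emit them unconditionally or only when the `GITHUB_ACTIONS` environment variable is set.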

Key Takeaways

  • A file-based CMS eliminates runtime dependencies entirely. Content is an artifact of the build, not a service.
  • Schema validation at build time catches errors earlier and more reliably than runtime validation.
  • Incremental builds are essential for scaling. Full rebuilds are only for CI verification.
  • The contributor experience is the main constraint. This approach works for technical teams, not content marketing departments.
  • Pre-commit hooks are the most effective quality gate. Build-time validation is the second line of defense.

Final Thoughts

This CMS has served 500+ posts over 18 months with zero runtime incidents. The entire content pipeline runs in CI, produces static HTML, and deploys to a CDN. There is no server to monitor, no database to back up, and no API to rate-limit. The trade-off is contributor experience, which is acceptable when all contributors are engineers who already live in Git.
