Implementing Server-Side Rendering Without Overhead

Dhruval Dhameliya·August 16, 2025·8 min read

Techniques for reducing SSR latency including streaming, selective hydration, component-level caching, and measured performance gains.

Context

A React application running on Next.js had SSR response times averaging 450ms at p50 and 1,200ms at p95. The pages rendered complex product detail views with nested components, multiple data sources, and dynamic pricing. The goal was to reduce SSR latency to under 100ms at p50 without sacrificing dynamic content.

Problem

SSR latency comes from three sources: data fetching, React rendering, and serialization. Most optimization guides focus on data fetching (add caching, parallelize queries). But rendering and serialization often dominate for component-heavy pages. I needed to address all three.

Constraints

  • Framework: Next.js 14 with App Router
  • Page complexity: 45 React components per page, 12 data fetching calls
  • Target: p50 TTFB under 100ms, p95 under 300ms
  • Dynamic content: user-specific pricing, real-time inventory
  • Must maintain SEO (full HTML in initial response)
  • No CDN caching for personalized pages
  • Deployment: Vercel serverless functions

Design

Baseline Measurement

Before optimization, I profiled the SSR pipeline:

| Phase | Duration (p50) | % of Total |
|---|---|---|
| Data fetching (sequential) | 280ms | 62% |
| React rendering | 120ms | 27% |
| HTML serialization | 35ms | 8% |
| Response overhead | 15ms | 3% |
| Total | 450ms | 100% |

Optimization 1: Parallel Data Fetching

The 12 data fetching calls ran sequentially in the page's server-side data loading, one await after another. Reorganizing them into parallel groups:

const [product, pricing, inventory, reviews, related, promotions] =
  await Promise.all([
    fetchProduct(slug),
    fetchPricing(slug, userId),
    fetchInventory(slug),
    fetchReviews(slug, { limit: 5 }),
    fetchRelatedProducts(slug, { limit: 4 }),
    fetchActivePromotions(slug),
  ]);

Some calls had dependencies (pricing depends on product ID), so not all 12 could be parallelized. The dependency graph allowed 3 parallel groups:

| Group | Calls | Duration |
|---|---|---|
| Group 1 | product, categories, site config | 45ms (max of group) |
| Group 2 | pricing, inventory, promotions (need product ID) | 38ms |
| Group 3 | reviews, related, recommendations (need product + user) | 42ms |
| Total | 12 calls | 125ms |

Data fetching dropped from 280ms to 125ms.
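
The dependency-ordered grouping can be sketched like this. The fetcher names and return shapes are illustrative stand-ins (only nine of the twelve calls are shown), with small timers standing in for network latency:

```typescript
// Stub fetchers standing in for the real data layer.
type Product = { id: string; name: string };
const delay = (ms: number) => new Promise((r) => setTimeout(r, ms));

const fetchProduct = async (slug: string): Promise<Product> => {
  await delay(10);
  return { id: `id-${slug}`, name: slug };
};
const fetchCategories = async () => { await delay(10); return ['a', 'b']; };
const fetchSiteConfig = async () => { await delay(10); return { locale: 'en' }; };
const fetchPricing = async (id: string, userId: string) => {
  await delay(10);
  return { id, userId, price: 99 };
};
const fetchInventory = async (id: string) => { await delay(10); return { id, inStock: true }; };
const fetchActivePromotions = async (id: string) => { await delay(10); return []; };
const fetchReviews = async (id: string, _opts: { limit: number }) => { await delay(10); return []; };
const fetchRelatedProducts = async (id: string, _opts: { limit: number }) => { await delay(10); return []; };
const fetchRecommendations = async (id: string, userId: string) => { await delay(10); return []; };

async function loadProductPage(slug: string, userId: string) {
  // Group 1: no dependencies — all three run concurrently.
  const [product, categories, siteConfig] = await Promise.all([
    fetchProduct(slug),
    fetchCategories(),
    fetchSiteConfig(),
  ]);
  // Group 2: each call needs the product ID resolved in group 1.
  const [pricing, inventory, promotions] = await Promise.all([
    fetchPricing(product.id, userId),
    fetchInventory(product.id),
    fetchActivePromotions(product.id),
  ]);
  // Group 3: needs both the product and the user context.
  const [reviews, related, recommendations] = await Promise.all([
    fetchReviews(product.id, { limit: 5 }),
    fetchRelatedProducts(product.id, { limit: 4 }),
    fetchRecommendations(product.id, userId),
  ]);
  return { product, categories, siteConfig, pricing, inventory,
           promotions, reviews, related, recommendations };
}
```

Each group awaits only what the next group actually needs, so total latency is the sum of the slowest call in each group rather than the sum of all twelve calls.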

Optimization 2: Component-Level Caching

Not all components need fresh data on every request. The site header, footer, navigation, and category sidebar are identical for all users. I cached their rendered HTML output:

import { cache } from 'react';
import { renderToString } from 'react-dom/server';

// Note: React's cache() deduplicates within a single request render,
// so repeated callers in one render share this result. Reuse across
// requests needs a module-level cache with a TTL.
const getCachedNavigation = cache(async () => {
  const nav = await fetchNavigation();
  return renderToString(<Navigation items={nav} />);
});

For server components in Next.js App Router, the built-in fetch cache handles this:

async function Navigation() {
  const res = await fetch('/api/navigation', {
    next: { revalidate: 300 }, // cached across requests for 5 minutes
  });
  const nav = await res.json();
  return <nav>...</nav>;
}

This removed 8 components from the per-request render tree, reducing rendering time from 120ms to 65ms.
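
For rendered-HTML reuse across requests on a warm instance, the idea reduces to a small keyed TTL cache. A minimal sketch (`createTtlCache` and its shape are my own illustration, not the production code; on serverless the cache lives only as long as the warm instance):

```typescript
// Keyed memoizer with a TTL: serve a fresh cached value, otherwise
// run the producer and store its result with an expiry timestamp.
type Entry<T> = { value: T; expires: number };

function createTtlCache<T>(ttlMs: number) {
  const store = new Map<string, Entry<T>>();
  return {
    async get(key: string, produce: () => Promise<T>): Promise<T> {
      const hit = store.get(key);
      if (hit && hit.expires > Date.now()) return hit.value; // fresh hit
      const value = await produce(); // miss or stale: recompute
      store.set(key, { value, expires: Date.now() + ttlMs });
      return value;
    },
  };
}
```

Something like `navCache.get('navigation', renderNavigation)` would then return the stored markup until the TTL elapses, at the cost of serving navigation that can be up to one TTL stale.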

Optimization 3: Streaming SSR

Next.js App Router supports React streaming. Instead of waiting for the entire page to render before sending any bytes, the server streams the shell immediately and fills in dynamic sections as they resolve:

// layout.tsx - sent immediately
export default function Layout({ children }) {
  return (
    <html>
      <body>
        <Header />  {/* Cached, instant */}
        <Suspense fallback={<ProductSkeleton />}>
          {children}  {/* Streamed when ready */}
        </Suspense>
        <Footer />  {/* Cached, instant */}
      </body>
    </html>
  );
}

With streaming, TTFB (time to first byte) dropped to 35ms because the cached shell sends immediately. The full page completes at 130ms, but the user sees content at 35ms.
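
The byte ordering can be illustrated with a toy generator: the cached shell and the skeleton are yielded before the slow section is even awaited. This is purely a sketch of why the first byte leaves early, not how React implements streaming:

```typescript
// Toy model of a streamed response: shell bytes go out immediately,
// the dynamic section is appended to the same stream when it resolves.
async function* renderPage(renderBody: () => Promise<string>) {
  yield '<header>cached shell</header>';         // first byte: no waiting
  yield '<div class="skeleton">loading…</div>';  // Suspense fallback
  yield await renderBody();                      // streamed when ready
  yield '<footer>cached shell</footer>';
}
```

The consumer receives the header and skeleton at 35ms and the product markup at 130ms, on one connection, in one HTML document.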

Optimization 4: Selective Hydration

Not all components need client-side interactivity. Product descriptions, specifications, and reviews are read-only. Marking them as server components eliminates their hydration JavaScript:

| Component | Hydration Needed | JS Bundle Impact |
|---|---|---|
| ProductImages (carousel) | Yes | 18KB |
| ProductDescription | No (server component) | -12KB saved |
| ProductSpecs | No (server component) | -4KB saved |
| Reviews | No (server component) | -8KB saved |
| AddToCart | Yes | 6KB |
| PricingDisplay | Yes (dynamic) | 3KB |

Total JS reduction: 24KB (from 51KB to 27KB for this page).

Results After All Optimizations

| Phase | Before | After | Improvement |
|---|---|---|---|
| Data fetching | 280ms | 125ms | 55% |
| React rendering | 120ms | 65ms | 46% |
| HTML serialization | 35ms | 20ms (smaller tree) | 43% |
| TTFB (streaming) | 450ms | 35ms | 92% |
| Full page complete | 450ms | 130ms | 71% |

Trade-offs

| Optimization | Benefit | Cost |
|---|---|---|
| Parallel fetching | 55% data fetch reduction | Increased code complexity, error handling for partial failures |
| Component caching | 46% render reduction | Cache invalidation complexity, stale navigation risk |
| Streaming | 92% TTFB reduction | Layout shift risk if suspense boundaries are poorly placed |
| Selective hydration | 47% JS reduction | Cannot add interactivity to server components later without refactoring |

The highest-impact change was streaming (92% TTFB improvement). The lowest-effort change was parallel fetching (a few lines of code). Component caching provided the best sustained throughput improvement but required careful cache invalidation.


Failure Modes

Streaming with error boundaries: If a streamed component throws during rendering, the error propagates to the nearest error boundary. If no error boundary exists, the entire stream fails. Unlike non-streaming SSR, there is no opportunity to retry the full page. Mitigation: wrap every <Suspense> boundary with an error boundary that renders a fallback UI.

Component cache poisoning: If a cached component inadvertently includes user-specific data (a logged-in username in the navigation), that data leaks to all subsequent users. Mitigation: strict separation between cached (shared) and uncached (personalized) components. Review cached components for any dependency on request context.

Hydration mismatch: If the server-rendered HTML differs from the client render (common with time-dependent content, locale differences, or feature flags), React logs a warning and re-renders, negating the SSR benefit. Mitigation: ensure deterministic rendering by passing all dynamic values as props from the server.
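
One way to make time-dependent content deterministic is to compute the value once on the server and format it with fixed settings, then pass the result down as a prop (`formatRenderedAt` is a hypothetical helper, not from the original codebase):

```typescript
// Format a server-captured timestamp deterministically, instead of
// calling Date.now() in the component body (which would differ between
// the server render and client hydration).
function formatRenderedAt(epochMs: number, locale = 'en-US'): string {
  // A fixed time zone and locale keep server and client output identical.
  return new Intl.DateTimeFormat(locale, {
    timeZone: 'UTC',
    dateStyle: 'medium',
  }).format(new Date(epochMs));
}
```

The component then renders the prop as-is, so both passes emit the same bytes regardless of the client's clock, locale, or time zone.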

Streaming and SEO: Search engine crawlers may not wait for streamed content to complete. Critical SEO content (title, description, structured data) must be in the non-streamed shell, not behind a Suspense boundary.


Scaling Considerations

  • Component caching reduces per-request CPU by 40-50%. At 1,000 req/min, this is the difference between 2 and 4 serverless function instances.
  • Streaming allows the CDN to start sending bytes to the client before the server finishes rendering. This improves perceived performance but does not reduce server compute time.
  • For pages with many independent data sources, consider micro-frontends or partial SSR where only the dynamic section is server-rendered and the rest is static.
  • Monitor the ratio of streaming time to total page time. If streaming content takes more than 3 seconds to complete, users may see too many skeleton states.

Observability

  • Server-Timing header with per-phase durations (fetch, render, serialize)
  • Track TTFB separately from full page load time (streaming makes these diverge significantly)
  • Monitor component cache hit rates and staleness
  • Log hydration mismatches in production (React warnings are often silenced)
  • Measure Largest Contentful Paint (LCP) as the user-facing metric, not TTFB
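
The first bullet can be as small as a helper that formats per-phase durations into the standard `Server-Timing` syntax (the metric names here are my own):

```typescript
// Build a Server-Timing header value from per-phase durations in ms.
// Browsers surface these metrics in the DevTools network panel.
function serverTimingHeader(phases: Record<string, number>): string {
  return Object.entries(phases)
    .map(([name, ms]) => `${name};dur=${ms}`)
    .join(', ');
}
```

For the optimized pipeline above, `serverTimingHeader({ fetch: 125, render: 65, serialize: 20 })` yields `fetch;dur=125, render;dur=65, serialize;dur=20`.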

Key Takeaways

  • Parallel data fetching is the single highest-ROI optimization. Most SSR slowness is sequential data fetching in disguise.
  • Component-level caching eliminates redundant rendering for shared UI elements. Identify components that are identical across requests and cache their output.
  • Streaming SSR transforms TTFB from "time to render everything" to "time to render the shell." This is a paradigm shift for perceived performance.
  • Selective hydration reduces client-side JavaScript. Mark read-only components as server components to avoid shipping unnecessary code.
  • Measure each phase independently. Optimizing rendering when data fetching is the bottleneck wastes effort.

Final Thoughts

The final result, 35ms TTFB with 130ms full completion, was achieved through four independent optimizations that each addressed a different phase of the SSR pipeline. No single optimization was sufficient. The combination reduced TTFB by 92% and full render time by 71%. The key insight is that SSR performance is a pipeline problem, and the pipeline is only as fast as its slowest sequential stage.
