Designing a Feature Flag and Remote Config System

Dhruval Dhameliya · February 6, 2026 · 6 min read

Architecture and trade-offs for building a feature flag and remote configuration system that handles targeting, rollout, and consistency across mobile clients.

Feature flags and remote config are distinct concerns often merged into one system. Flags control feature visibility. Config controls runtime behavior. Conflating them causes operational confusion. This post covers how to design a system that handles both cleanly.
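As a concrete illustration, the two concerns can stay separate at the API surface even when they share transport and caching underneath. A minimal sketch, with interface names that are illustrative rather than from any particular SDK:

// Illustrative sketch: flag reads and config reads stay separate even if
// they share a fetcher and a cache underneath. Names are assumptions.
interface FeatureFlags {
    // Controls feature visibility: on or off for this user/session.
    fun isEnabled(key: String, default: Boolean = false): Boolean
}

interface RemoteConfig {
    // Controls runtime behavior: tuning values, not visibility.
    fun getInt(key: String, default: Int): Int
    fun getString(key: String, default: String): String
}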

Context

Product teams need to gate features behind flags for gradual rollout, kill switches, and experimentation. Engineering teams need remote config for tuning timeouts, retry counts, and thresholds without app releases. Both require a system that resolves values quickly, caches aggressively, and fails safely.

Problem

Design a system that:

  • Evaluates feature flags and config values on the client with minimal latency
  • Supports targeting by user ID, device attributes, app version, and custom segments
  • Handles thousands of flags without bloating the payload
  • Provides safe defaults when the system is unreachable

Constraints

Constraint | Detail
Cold start | Flags must be available before the first screen renders
Staleness | Config can be up to 15 minutes stale; flags for experiments must be consistent within a session
Payload size | Full config payload must stay under 50KB compressed
Evaluation speed | Flag resolution must complete in under 5ms on-device
Consistency | A user must see the same flag value for the duration of a session

Design

Data Model

data class Flag(
    val key: String,
    val type: FlagType, // BOOLEAN, STRING, INT, JSON
    val defaultValue: Any,
    val rules: List<TargetingRule>,
    val rolloutPercentage: Int, // 0-100
    val enabled: Boolean
)
 
data class TargetingRule(
    val attribute: String, // "app_version", "country", "user_segment"
    val operator: Operator, // EQUALS, GREATER_THAN, IN, NOT_IN
    val value: Any
)
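
For concreteness, a hypothetical flag gating a redesigned checkout for newer app versions might look like this (all values illustrative):

// Hypothetical flag: gate "new_checkout" to app versions above 5.2.0,
// rolled out to 25% of matching users.
val newCheckout = Flag(
    key = "new_checkout",
    type = FlagType.BOOLEAN,
    defaultValue = false,
    rules = listOf(
        TargetingRule(
            attribute = "app_version",
            operator = Operator.GREATER_THAN,
            value = "5.2.0"
        )
    ),
    rolloutPercentage = 25,
    enabled = true
)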

Evaluation Flow

  1. App starts, loads cached config from disk (SharedPreferences or DataStore).
  2. A background fetch requests the full config payload from the server.
  3. On response, the new config is written to disk and swapped into the in-memory evaluator.
  4. Flag evaluation: check enabled, evaluate targeting rules in order, then apply the rollout percentage using a deterministic hash of (userId + flagKey), as in the evaluator below.

class FlagEvaluator(
    private val context: EvaluationContext // userId, deviceInfo, appVersion
) {
    fun evaluateBoolean(flag: Flag): Boolean {
        // Kill switch: a disabled flag always resolves to its coded default.
        if (!flag.enabled) return flag.defaultValue as Boolean

        // All targeting rules must match; any miss falls back to the default.
        for (rule in flag.rules) {
            if (!rule.matches(context)) return flag.defaultValue as Boolean
        }

        // Deterministic bucketing: the same (userId, flagKey) pair always
        // lands in the same bucket. mod() keeps the bucket non-negative even
        // when the hash is negative, which the % operator would not.
        val bucket = murmurHash("${context.userId}:${flag.key}").mod(100)
        return bucket < flag.rolloutPercentage
    }
}
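
The rule matcher itself is not shown above. A minimal sketch, assuming an attribute() accessor on EvaluationContext and simple string comparisons (a real implementation needs typed comparisons, e.g. semantic version ordering):

// Sketch only: treats values as strings for brevity. The attribute()
// accessor on EvaluationContext is an assumed helper.
fun TargetingRule.matches(context: EvaluationContext): Boolean {
    val actual = context.attribute(attribute) ?: return false
    return when (operator) {
        Operator.EQUALS -> actual == value.toString()
        Operator.GREATER_THAN -> actual > value.toString() // lexicographic; real code compares semver
        Operator.IN -> (value as List<*>).any { it.toString() == actual }
        Operator.NOT_IN -> (value as List<*>).none { it.toString() == actual }
    }
}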

Server Architecture

Client -> CDN (cached config JSON) -> Config Service -> Flag Store (PostgreSQL)
                                          |
                                    Admin Dashboard
  • The Config Service compiles all active flags into a single JSON payload, versioned with an ETag.
  • The CDN caches this payload with a 5-minute TTL.
  • Clients send If-None-Match headers to avoid downloading unchanged payloads (see the fetch sketch after this list).
  • The Admin Dashboard provides flag creation, targeting rule management, and audit logs.
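
A sketch of the client's conditional fetch using the JDK's HttpURLConnection; the endpoint and result shape are assumptions:

import java.net.HttpURLConnection
import java.net.URL

data class FetchResult(val payload: String, val etag: String?)

// Returns null when the payload is unchanged (304) or the fetch fails;
// the caller keeps its cached copy or coded defaults in both cases.
fun fetchConfig(endpoint: String, cachedEtag: String?): FetchResult? {
    val conn = URL(endpoint).openConnection() as HttpURLConnection
    cachedEtag?.let { conn.setRequestProperty("If-None-Match", it) }
    return when (conn.responseCode) {
        HttpURLConnection.HTTP_NOT_MODIFIED -> null // cached payload still current
        HttpURLConnection.HTTP_OK -> FetchResult(
            payload = conn.inputStream.bufferedReader().use { it.readText() },
            etag = conn.getHeaderField("ETag") // stored for the next If-None-Match
        )
        else -> null // fail safe
    }
}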

Caching Strategy

Layer | TTL | Purpose
CDN | 5 min | Reduce load on Config Service
In-memory (client) | Session lifetime | Fast evaluation, session consistency
Disk (client) | Until next successful fetch | Cold start fallback
Server response cache | 1 min | Avoid recomputing payload per request
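
The disk layer is what makes the cold-start constraint workable. A sketch of the startup read, with the cache file location and parse function left as assumptions:

import java.io.File

// Read the last-known payload synchronously before first render; the
// background fetch refreshes it afterwards.
fun loadInitialFlags(cacheFile: File, parse: (String) -> Map<String, Flag>): Map<String, Flag> {
    if (!cacheFile.exists()) return emptyMap() // first install: coded defaults apply
    return runCatching { parse(cacheFile.readText()) }
        .getOrElse { emptyMap() } // unreadable cache: fail safe to defaults
}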

Session Consistency

Once flags are loaded for a session, they are pinned. Mid-session updates only take effect on the next app launch or explicit refresh. This prevents UI flicker and ensures experiment integrity.

See also: Event Tracking System Design for Android Applications.

// Singleton store: flags are pinned for the session on first initialize;
// later fetches only update the disk cache for the next cold start.
object FlagStore {
    private var sessionFlags: Map<String, Flag> = emptyMap()
    private var initialized = false

    fun initialize(flags: Map<String, Flag>) {
        // First call wins: mid-session fetches cannot change the pinned set.
        if (!initialized) {
            sessionFlags = flags
            initialized = true
        }
    }

    fun getFlag(key: String): Flag? = sessionFlags[key]

    fun refreshForNextSession(flags: Map<String, Flag>) {
        // Write to disk only; will be loaded on the next cold start.
        persistToDisk(flags)
    }
}

Trade-offs

Decision | Upside | Downside
Full payload download | Simple client logic, no per-flag API calls | Payload grows with flag count
CDN caching | Low latency, high availability | Propagation delay for urgent changes
Client-side evaluation | No network call per evaluation, works offline | Targeting logic must be shipped to client
Session pinning | Consistent UX, clean experiment data | Urgent kill switches delayed until next session
Deterministic hashing for rollout | Stable assignment, no server state needed | Cannot rebalance without changing the hash seed

Failure Modes

  • CDN outage: Client falls back to disk cache. If disk cache is empty (first install), all flags resolve to coded defaults.
  • Corrupt payload: Validate JSON schema before swapping into memory. Reject and retain previous cache on validation failure.
  • Hash collision clustering: Monitor rollout distribution. If a flag at 10% rollout shows 15% actual exposure, investigate hash function quality.
  • Stale kill switch: Give critical kill switches a secondary fast path, a lightweight endpoint that returns only kill switch states, fetched every 60 seconds outside the CDN (sketched below).
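
A sketch of that fast path as a fixed-interval poller; the fetch function is an assumed hook to the lightweight endpoint:

import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Polls a tiny kill-switch-only payload every 60 seconds, bypassing the
// CDN. Unlike session-pinned flags, these updates apply immediately.
class KillSwitchPoller(
    private val fetchKillSwitches: () -> Map<String, Boolean>,
    private val onUpdate: (Map<String, Boolean>) -> Unit
) {
    private val scheduler = Executors.newSingleThreadScheduledExecutor()

    fun start() {
        scheduler.scheduleAtFixedRate({
            // A failed poll keeps the last-known state.
            runCatching { onUpdate(fetchKillSwitches()) }
        }, 0, 60, TimeUnit.SECONDS)
    }

    fun stop() = scheduler.shutdown()
}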

Scaling Considerations

  • At 1,000+ flags, switch from a flat JSON payload to a segmented approach: core flags in the main payload, rarely used flags fetched on demand.
  • Use delta updates: send only the flags changed since the client's last known version. When few flags change between fetches, this can cut payload size by 80-90%; a merge sketch follows this list.
  • For multi-platform consistency (Android, iOS, web), centralize evaluation logic in a shared rules engine or ensure identical hash implementations across platforms.
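
A sketch of applying such a delta on the client; the ConfigDelta shape is an assumption about what the server would send:

// Merge a delta into the current flag map. Persist the result together
// with newVersion so the next fetch can request the right delta.
data class ConfigDelta(
    val baseVersion: Long,
    val newVersion: Long,
    val changed: List<Flag>,     // added or modified flags
    val removed: List<String>    // keys deleted since baseVersion
)

fun applyDelta(current: Map<String, Flag>, delta: ConfigDelta): Map<String, Flag> {
    val merged = current.toMutableMap()
    delta.removed.forEach { merged.remove(it) }
    delta.changed.forEach { merged[it.key] = it }
    return merged
}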

Observability

Related: Mobile Analytics Pipeline: From App Event to Dashboard.

  • Log flag evaluation results (key, value, source: cache/network/default) to the analytics pipeline; a minimal event shape is sketched after this list.
  • Track: config fetch success rate, payload size, evaluation latency p50/p95/p99, flag override count.
  • Alert on: flag evaluation falling back to defaults for more than 5% of sessions, payload size exceeding threshold, CDN hit rate dropping below 90%.
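
That evaluation event might be shaped like this; field names are illustrative:

// One record per flag evaluation. Source distinguishes network-fresh
// config, disk cache, and coded default, feeding the fallback-rate alert.
data class FlagEvaluationEvent(
    val key: String,
    val value: String,          // stringified resolved value
    val source: Source,
    val evaluationMicros: Long  // feeds the p50/p95/p99 latency metrics
) {
    enum class Source { CACHE, NETWORK, DEFAULT }
}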

Key Takeaways

  • Separate feature flags from remote config conceptually, even if they share infrastructure.
  • Pin flag values per session for consistency. Mid-session changes cause subtle bugs.
  • Always have a coded default. The system must function when the config service is completely unreachable.
  • Use deterministic hashing for rollout percentages. Random assignment breaks experiment analysis.
  • Build a kill switch fast path that bypasses CDN caching for emergency shutoffs.

Final Thoughts

A feature flag system is a control plane for your product. Design it with the same care as your data plane. The moment you cannot trust your flags, you cannot trust your releases.
