Designing Event Schemas That Survive Product Changes
How to design analytics event schemas that remain valid through product pivots, feature changes, and evolving business requirements without breaking downstream consumers.
Most event schemas are designed for the current product state. When the product changes (a screen is renamed, a flow is restructured, a field is repurposed), the schema breaks. Downstream queries, dashboards, and ML pipelines silently produce wrong results. This post covers how to design schemas that tolerate product evolution.
See also: Event Tracking System Design for Android Applications.
Context
Related: Engineering Principles I Apply Across Domains.
Event schemas define the structure of analytics data. They are consumed by data analysts, ML models, experimentation platforms, and automated alerting systems. A schema change that is backward-incompatible can break dozens of downstream dependencies, many of which are invisible to the engineering team making the change.
Problem
Design an event schema system that:
- Remains valid through product feature changes
- Supports schema evolution without breaking consumers
- Enables both forward and backward compatibility
- Provides clear contracts between producers and consumers
Constraints
| Constraint | Detail |
|---|---|
| Consumer count | 10-50 downstream consumers per major event type |
| Change frequency | Product changes weekly; schema should not need to change that often |
| Discovery | New team members must understand event semantics without tribal knowledge |
| Validation | Invalid events must be caught before reaching consumers |
| Versioning | Multiple schema versions may be in flight simultaneously (old app versions) |
Design
Schema Design Principles
1. Name events by user intent, not UI implementation.
| Bad | Good | Rationale |
|---|---|---|
| red_button_clicked | purchase_initiated | Button color changes; intent does not |
| screen_3_viewed | product_detail_viewed | Screen ordering changes; the concept persists |
| new_modal_dismissed | upsell_dismissed | "New" is relative; the feature has a purpose |
2. Use semantic property names, not positional ones.
| Bad | Good |
|---|---|
| value1, value2 | product_id, price_usd |
| type: "A" | subscription_tier: "premium" |
| flag: true | is_returning_user: true |
3. Separate stable identifiers from display values.
```kotlin
data class ProductViewedEvent(
    val productId: String,           // Stable identifier, never changes
    val productName: String,         // Display value, may change
    val categoryId: String,          // Stable
    val categoryDisplayName: String  // May change
)
```

Downstream queries should join on productId, never on productName.
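To make the join rule concrete, here is a minimal sketch of what happens to a simple view count when a product is renamed mid-stream. The trimmed-down `ProductViewed` class, the `countViews` helper, and the sample data are all hypothetical and exist only for illustration.

```kotlin
// Hypothetical, trimmed-down version of ProductViewedEvent for illustration.
data class ProductViewed(val productId: String, val productName: String)

// Counts views per key; the key selector decides what "same product" means.
fun countViews(events: List<ProductViewed>, key: (ProductViewed) -> String): Map<String, Int> =
    events.groupingBy(key).eachCount()

// Invented sample data: product p-42 is renamed between the second and third view.
val sampleEvents = listOf(
    ProductViewed("p-42", "Trail Shoe"),
    ProductViewed("p-42", "Trail Shoe"),
    ProductViewed("p-42", "Trail Runner X"),
)

val viewsById = countViews(sampleEvents) { it.productId }     // a single bucket for p-42
val viewsByName = countViews(sampleEvents) { it.productName } // split across two names
```

Grouping by the identifier keeps all three views together, while grouping by the display name splits the same product into two rows.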
Schema Structure
```
EventSchema {
    name: String                 // "purchase_completed"
    version: Int                 // 3
    description: String          // Human-readable purpose
    fields: List<FieldDefinition>
    required_fields: List<String>
    deprecated_fields: List<DeprecatedField>
}

FieldDefinition {
    name: String
    type: FieldType              // STRING, INT, FLOAT, BOOLEAN, ARRAY, OBJECT
    description: String
    enum_values: List<String>?   // Allowed values for STRING enums
    added_in_version: Int
}

DeprecatedField {
    name: String
    deprecated_in_version: Int
    replacement: String?         // Field that replaces it
    removal_version: Int?        // Version when it will be removed
}
```
Evolution Rules
| Change Type | Compatibility | Action Required |
|---|---|---|
| Add optional field | Backward + forward compatible | No consumer changes needed |
| Add required field | Forward compatible only | Consumers that require the field break on events from old app versions that omit it |
| Remove field | Breaking | Deprecate first, remove after migration |
| Rename field | Breaking | Add new field, deprecate old, keep both |
| Change field type | Breaking | Add new field with new type, deprecate old |
| Add enum value | Backward compatible only | Old consumers may not handle the new value |
| Remove enum value | Breaking | Deprecate, stop emitting, then remove |
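The rules in this table can be checked mechanically. The following is a rough sketch of such a check, not a complete compatibility checker: the `Field` model, the `Change` enum, and the decision to treat every change other than adding an optional field as blocking are assumptions made for illustration. Enum value changes and optional-to-required transitions are left out for brevity.

```kotlin
// Hypothetical minimal model: a field has a name, a type name, and a required flag.
data class Field(val name: String, val type: String, val required: Boolean = false)

enum class Change { ADD_OPTIONAL_FIELD, ADD_REQUIRED_FIELD, REMOVE_FIELD, CHANGE_FIELD_TYPE }

// Assumption: anything except adding an optional field needs a migration plan.
val breakingChanges = setOf(Change.ADD_REQUIRED_FIELD, Change.REMOVE_FIELD, Change.CHANGE_FIELD_TYPE)

// Compares two versions of one event's field list and reports each detected change.
// A rename cannot be told apart from remove + add here, so it is reported as both.
fun diffSchemas(old: List<Field>, new: List<Field>): List<Pair<String, Change>> {
    val oldByName = old.associateBy { it.name }
    val newByName = new.associateBy { it.name }
    val changes = mutableListOf<Pair<String, Change>>()
    for ((name, field) in newByName) {
        val previous = oldByName[name]
        if (previous == null) {
            val kind = if (field.required) Change.ADD_REQUIRED_FIELD else Change.ADD_OPTIONAL_FIELD
            changes += name to kind
        } else if (previous.type != field.type) {
            changes += name to Change.CHANGE_FIELD_TYPE
        }
    }
    for (name in oldByName.keys - newByName.keys) {
        changes += name to Change.REMOVE_FIELD
    }
    return changes
}

// True when none of the detected changes is in the blocking set.
fun isSafe(changes: List<Pair<String, Change>>): Boolean =
    changes.none { it.second in breakingChanges }
```

A check like this could run against the previous schema version whenever a schema file changes, failing the build when `isSafe` returns false.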
Registry and Validation
```
EventSchemaRegistry {
    schemas: Map<(event_name, version), EventSchema>

    validate(event_name, version, payload):
        schema = schemas[(event_name, version)]
        if schema is null:
            return UNKNOWN_SCHEMA
        for field in schema.required_fields:
            if field not in payload:
                return MISSING_REQUIRED_FIELD(field)
        for (key, value) in payload:
            field_def = schema.fields.find(key)
            if field_def is null:
                return UNKNOWN_FIELD(key) // or WARN for forward compat
            if not type_matches(value, field_def.type):
                return TYPE_MISMATCH(key, expected=field_def.type, got=typeof(value))
        return VALID
}
```
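The validation logic above could translate to Kotlin roughly as follows. This is a sketch under assumptions rather than a reference implementation: the type names (`Schema`, `FieldDef`, `ValidationResult`, `Registry`) are invented for this example, and the way runtime values map to field types in `typeMatches` is one possible choice.

```kotlin
enum class FieldType { STRING, INT, FLOAT, BOOLEAN, ARRAY, OBJECT }

data class FieldDef(val name: String, val type: FieldType)

data class Schema(val fields: List<FieldDef>, val requiredFields: List<String>)

sealed class ValidationResult {
    object Valid : ValidationResult()
    object UnknownSchema : ValidationResult()
    data class MissingRequiredField(val field: String) : ValidationResult()
    data class UnknownField(val field: String) : ValidationResult()
    data class TypeMismatch(val field: String, val expected: FieldType) : ValidationResult()
}

// Assumption: Int and Long both count as INT, Float and Double both count as FLOAT.
fun typeMatches(value: Any, type: FieldType): Boolean = when (type) {
    FieldType.STRING -> value is String
    FieldType.INT -> value is Int || value is Long
    FieldType.FLOAT -> value is Float || value is Double
    FieldType.BOOLEAN -> value is Boolean
    FieldType.ARRAY -> value is List<*>
    FieldType.OBJECT -> value is Map<*, *>
}

class Registry(private val schemas: Map<Pair<String, Int>, Schema>) {
    // Mirrors the pseudocode: schema lookup, required fields, then per-field checks.
    fun validate(eventName: String, version: Int, payload: Map<String, Any>): ValidationResult {
        val schema = schemas[eventName to version] ?: return ValidationResult.UnknownSchema
        for (field in schema.requiredFields) {
            if (field !in payload) return ValidationResult.MissingRequiredField(field)
        }
        for ((key, value) in payload) {
            // Strict mode: unknown fields are rejected. A lenient variant could warn instead.
            val def = schema.fields.find { it.name == key }
                ?: return ValidationResult.UnknownField(key)
            if (!typeMatches(value, def.type)) return ValidationResult.TypeMismatch(key, def.type)
        }
        return ValidationResult.Valid
    }
}
```

Modelling the result as a sealed class lets callers branch on each failure kind with `when`, which fits the client-side enforcement style described in the next section.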
Client-Side Schema Enforcement
```kotlin
class SchemaEnforcedTracker(
    private val registry: EventSchemaRegistry,
    private val analytics: RawAnalyticsClient
) {
    fun track(eventName: String, properties: Map<String, Any>) {
        val schema = registry.getLatestSchema(eventName)
        if (schema == null) {
            logWarning("No schema found for event: $eventName")
            return // Drop unregistered events
        }
        val validation = schema.validate(properties)
        when (validation) {
            is Valid -> analytics.track(eventName, properties + ("schema_version" to schema.version))
            is MissingField -> {
                logError("Missing required field ${validation.field} in $eventName")
                // Send anyway with missing field flag for debugging
                analytics.track(eventName, properties + ("_schema_error" to "missing:${validation.field}"))
            }
            is TypeMismatch -> {
                logError("Type mismatch in $eventName: ${validation.detail}")
                // Attempt coercion or drop
            }
        }
    }
}
```

Handling Product Changes
Scenario: A two-step checkout becomes a three-step checkout.
Old events: checkout_started, checkout_completed
New events: checkout_started, checkout_shipping_selected, checkout_completed
Resolution:
- Add the new event (checkout_shipping_selected) as a new schema. No existing events change.
- Add an optional field checkout_step_count to checkout_completed (value: 2 for old flow, 3 for new flow).
- Downstream consumers that count checkout steps adapt to the new event without breaking.
Scenario: "Likes" are renamed to "Reactions" with multiple types.
Old event: post_liked with {post_id}
New event: post_reacted with {post_id, reaction_type}
Resolution:
- Introduce post_reacted as a new event.
- Deprecate post_liked. Continue emitting it alongside post_reacted for one release cycle.
- The deprecated post_liked event carries a property _deprecated: true and _replacement: "post_reacted".
- Consumers migrate to post_reacted. After migration, stop emitting post_liked.
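The dual-emit step in the Likes-to-Reactions scenario might look like the sketch below. The `EventSink` interface and the `ReactionTracker` class are hypothetical stand-ins for whatever analytics client the app uses. Mapping only the "like" reaction back to post_liked is an assumption about how the old semantics should be preserved.

```kotlin
// Hypothetical sink interface standing in for the app's analytics client.
fun interface EventSink {
    fun track(name: String, properties: Map<String, Any>)
}

class ReactionTracker(
    private val sink: EventSink,
    // Flip to false once every consumer has migrated to post_reacted.
    private val emitLegacyLike: Boolean = true,
) {
    fun trackReaction(postId: String, reactionType: String) {
        sink.track("post_reacted", mapOf("post_id" to postId, "reaction_type" to reactionType))
        // Assumption: only a "like" reaction corresponds to the old post_liked event.
        if (emitLegacyLike && reactionType == "like") {
            sink.track(
                "post_liked",
                mapOf("post_id" to postId, "_deprecated" to true, "_replacement" to "post_reacted"),
            )
        }
    }
}
```

Keeping the legacy emission behind a single flag means the final migration step is a one-line change rather than a search through call sites.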
Trade-offs
| Decision | Upside | Downside |
|---|---|---|
| Intent-based naming | Survives UI changes | Requires upfront thought about user intent |
| Schema registry | Central validation, discovery | Registry is a dependency that can fail |
| Additive-only evolution | Never breaks consumers | Schemas accumulate deprecated fields |
| Dual-emit during migration | No consumer downtime | Temporary data duplication |
| Client-side validation | Catches errors at the source | Schema updates require app releases |
Failure Modes
- Schema registry unavailable: Client falls back to permissive mode (send without validation). Flag events for retroactive validation.
- Consumer depends on deprecated field: Consumer silently reads null/default values. Mitigation: log deprecation warnings in consumer pipelines, not just producer.
- Enum expansion breaks consumer: A new reaction_type value causes a consumer's switch statement to hit the default case. Mitigation: consumers must always handle unknown enum values gracefully.
- Schema version mismatch across platforms: Android sends v3, iOS sends v2 of the same event. Consumers must handle multiple versions simultaneously.
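For the enum expansion case, one way a consumer might handle unknown values gracefully is to parse into an enum with an explicit fallback. This is a sketch only: the `ReactionType` enum and its values (LIKE, LOVE, LAUGH) are invented for illustration and are not taken from any real schema.

```kotlin
// Hypothetical consumer-side enum; UNKNOWN absorbs values added by newer producers.
enum class ReactionType {
    LIKE, LOVE, LAUGH, UNKNOWN;

    companion object {
        // Falls back to UNKNOWN instead of throwing, unlike valueOf().
        fun parse(raw: String?): ReactionType =
            values().firstOrNull { it != UNKNOWN && it.name.equals(raw, ignoreCase = true) } ?: UNKNOWN
    }
}

// Counts reactions per type. Unrecognised values land in their own bucket,
// so they stay visible instead of being dropped or crashing the job.
fun countReactions(rawValues: List<String?>): Map<ReactionType, Int> =
    rawValues.groupingBy { ReactionType.parse(it) }.eachCount()
```

Tracking the size of the UNKNOWN bucket over time also gives the consumer an early signal that a producer has started emitting a value it does not yet understand.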
Scaling Considerations
- At 500+ event types with 5+ versions each, the schema registry becomes a critical service. Cache schemas aggressively on the client and in the pipeline.
- Automated schema compatibility checks in CI: reject PRs that introduce backward-incompatible changes without a migration plan.
- Generate client-side tracking code from schemas to prevent drift between schema definitions and actual event payloads.
Observability
- Track: schema validation failure rate by event type, deprecated field usage rate, schema version distribution across clients.
- Alert on: validation failure rate exceeding 1% for any event type, deprecated field usage not declining after deprecation announcement.
- Dashboard: event catalog showing all events, their versions, field definitions, and consumer list.
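The 1% alert rule can be expressed as a small function over per-event counters. The sketch below assumes those counters are already aggregated somewhere (for example from validation logs). `ValidationStats` and `eventsToAlertOn` are hypothetical names, and skipping event types with no traffic is a design choice rather than a requirement.

```kotlin
// Hypothetical per-event-type counters, e.g. aggregated from validation logs.
data class ValidationStats(val eventName: String, val total: Long, val failed: Long)

// Returns the event types whose failure rate is strictly above the threshold.
// The 1% default mirrors the alert rule above; types with no traffic are skipped.
fun eventsToAlertOn(stats: List<ValidationStats>, threshold: Double = 0.01): List<String> =
    stats.filter { it.total > 0 && it.failed.toDouble() / it.total > threshold }
        .map { it.eventName }
```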
Key Takeaways
- Name events by user intent, not UI elements. UI changes constantly; user intent is stable.
- Separate stable identifiers from display values. Never let a downstream query depend on a display string.
- Evolve schemas additively. Adding optional fields is safe; removing or renaming fields is not.
- Dual-emit during transitions. Give consumers time to migrate before removing old events.
- Validate at the source. A schema error caught on the client is worth ten pipeline debugging sessions.
Further Reading
- Mobile Analytics Pipeline: From App Event to Dashboard: End-to-end design of a mobile analytics pipeline covering ingestion, processing, storage, and querying, with emphasis on reliability and ...
- Designing a Feature Flag and Remote Config System: Architecture and trade-offs for building a feature flag and remote configuration system that handles targeting, rollout, and consistency ...
- Designing an Experimentation Platform for Mobile Apps: System design for a mobile experimentation platform covering assignment, exposure tracking, metric collection, statistical analysis, and ...
Final Thoughts
Event schemas are the API contract between your product and your data consumers. Treat them with the same rigor as your public API. Version them, document them, validate them, and evolve them intentionally. The cost of a broken schema is not a failed build. It is weeks of decisions based on corrupted data.