Designing APIs With Mobile Constraints in Mind
How to design backend APIs that account for mobile-specific constraints: bandwidth, latency, battery, intermittent connectivity, and long-lived client versions.
APIs designed for web frontends often perform poorly on mobile. They assume fast, reliable connections, unlimited bandwidth, and clients that can be updated instantly. None of these hold for mobile. This post covers how to design APIs that work well under mobile constraints from the start.
Context
Mobile clients differ from web clients in fundamental ways: they operate on variable-quality networks, they consume battery with every network request, they cannot be force-updated, and they have limited memory for caching large responses. An API that ignores these constraints forces the mobile team to build complex workarounds on the client.
Problem
Design APIs that:
- Minimize bandwidth usage and request count
- Tolerate high latency and intermittent connectivity
- Support long-lived client versions without breaking changes
- Return data in shapes that match mobile UI requirements
Constraints
| Constraint | Detail |
|---|---|
| Bandwidth | Metered connections, 50 Kbps to 100 Mbps range |
| Latency | 50ms (WiFi) to 5000ms (2G) RTT |
| Battery | Each request activates the radio, costing battery even for small payloads |
| Client versions | 20+ app versions in production simultaneously |
| Memory | Response must be parseable without loading entire payload into memory on low-end devices |
Design
Response Design: Aggregate, Do Not Scatter
The single worst pattern for mobile is requiring multiple sequential API calls to render one screen.
Bad: N+1 calls to render a profile
GET /users/123 -> user data
GET /users/123/posts -> post list
GET /users/123/followers -> follower count
GET /users/123/badges -> achievement badges
Good: Single aggregated endpoint
GET /users/123/profile
{
"user": { ... },
"recent_posts": [ ... ],
"follower_count": 1234,
"badges": [ ... ]
}
This is the Backend for Frontend (BFF) pattern. A mobile-specific BFF aggregates data from multiple services into a single response shaped for the screen.
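The aggregation itself is mostly fan-out and merge. A minimal sketch, assuming hypothetical per-service fetchers (fetch_user, fetch_recent_posts, and so on) standing in for real HTTP or RPC calls:

```python
import asyncio

# Hypothetical per-service fetchers; in a real BFF these would be
# HTTP or RPC calls to the user, posts, follower, and badge services.
async def fetch_user(user_id):
    return {"id": user_id, "name": "Ada"}

async def fetch_recent_posts(user_id):
    return [{"id": "p1", "title": "Hello"}]

async def fetch_follower_count(user_id):
    return 1234

async def fetch_badges(user_id):
    return ["early_adopter"]

async def get_profile(user_id):
    # Fan out to all services concurrently, then merge into the
    # single response shape the profile screen needs.
    user, posts, followers, badges = await asyncio.gather(
        fetch_user(user_id),
        fetch_recent_posts(user_id),
        fetch_follower_count(user_id),
        fetch_badges(user_id),
    )
    return {
        "user": user,
        "recent_posts": posts,
        "follower_count": followers,
        "badges": badges,
    }

profile = asyncio.run(get_profile("123"))
```

The concurrent fan-out matters: the BFF pays the slowest backend's latency once, instead of the client paying each backend's latency in sequence over a mobile network.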
Field Selection
Let clients request only the fields they need:
GET /users/123/profile?fields=user.name,user.avatar_url,follower_count
For screens that need only a subset of the data, this can cut payload size by 50-80%.
def field_selection(full_response, requested_fields):
    if not requested_fields:
        return full_response  # Default: return everything
    filtered = {}
    for field_path in requested_fields:
        # Walk the dotted path (e.g. "user.name") into the source dict...
        value = full_response
        keys = field_path.split(".")
        for key in keys:
            value = value[key]
        # ...then recreate the same nested path in the filtered dict.
        target = filtered
        for key in keys[:-1]:
            target = target.setdefault(key, {})
        target[keys[-1]] = value
    return filtered
Pagination
Mobile pagination must be cursor-based, not offset-based:
| Approach | Behavior on Mobile |
|---|---|
| Offset (?page=3&limit=20) | Insertions cause duplicates or skipped items mid-pagination |
| Cursor (?after=abc123&limit=20) | Stable regardless of insertions or deletions |
GET /feed?after=cursor_abc123&limit=20
Response:
{
"items": [...],
"cursors": {
"next": "cursor_def456",
"has_more": true
}
}
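One common implementation makes the cursor an opaque encoding of the last item's sort key, so the server can resume the scan without trusting a client-supplied offset. A minimal sketch, assuming items are ordered by (created_at, id):

```python
import base64
import json

def encode_cursor(created_at, item_id):
    # Opaque to the client: base64 of the last item's sort key.
    raw = json.dumps({"t": created_at, "id": item_id}).encode()
    return base64.urlsafe_b64encode(raw).decode()

def decode_cursor(cursor):
    data = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return data["t"], data["id"]

cursor = encode_cursor(1730000000, "post_42")
# The next page query then becomes roughly:
#   WHERE (created_at, id) > (t, id) ORDER BY created_at, id LIMIT 20
```

Because the cursor pins a position in the sort order rather than a count of skipped rows, concurrent inserts and deletes cannot shift the window.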
Compression and Payload Optimization
| Technique | Bandwidth Savings | Trade-off |
|---|---|---|
| Gzip response compression | 60-80% | CPU on client for decompression |
| JSON field shortening (e.g., fn instead of first_name) | 10-20% | Readability loss, debugging difficulty |
| Protocol Buffers | 30-50% over JSON | Schema management, less human-readable |
| Delta responses (send only changes) | 70-90% for updates | Complex client-side merge logic |
Recommendation: Use gzip compression on all responses (essentially free). Consider Protocol Buffers for high-frequency endpoints. Avoid field shortening; the readability cost outweighs the bandwidth savings.
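To illustrate why gzip is essentially free, a quick sketch using Python's standard library at the balanced compression level (the payload here is an illustrative repetitive list response, typical of feed endpoints):

```python
import gzip
import json

# A repetitive JSON payload, typical of list responses.
payload = json.dumps(
    [{"id": i, "status": "active"} for i in range(200)]
).encode()

# Level 6 is the usual balance of ratio vs CPU; level 9 buys little
# extra compression at a noticeable CPU cost on low-end devices.
compressed = gzip.compress(payload, compresslevel=6)

ratio = len(compressed) / len(payload)
```

JSON's repeated key names compress extremely well, which is also why field shortening buys so little on top of gzip.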
Conditional Requests
Avoid re-downloading unchanged data:
Client sends: GET /feed
If-None-Match: "etag_abc123"
Server responds: 304 Not Modified (empty body)
For collections that change partially:
Client sends: GET /feed
If-Modified-Since: Sun, 26 Oct 2025 12:00:00 GMT
Server responds: 200 OK
{
"items": [/* only items modified since the given timestamp */],
"deleted_ids": ["id1", "id2"],
"last_modified": "2025-10-27T08:30:00Z"
}
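Server-side, an ETag can be as simple as a hash of the serialized body compared against If-None-Match. A minimal framework-agnostic sketch (check_etag is a hypothetical helper, not a library API):

```python
import hashlib

def check_etag(body, if_none_match):
    # Practical ETag: hash of the serialized response body.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    if if_none_match == etag:
        # Not Modified: the empty body is what saves the transfer.
        return 304, b"", etag
    return 200, body, etag

# First request: no cached ETag, full body comes back.
status1, body1, etag = check_etag(b'{"items":[]}', None)
# Revalidation: client echoes the ETag, server skips the body.
status2, body2, _ = check_etag(b'{"items":[]}', etag)
```

Hashing the body means the ETag stays correct even when it is hard to track modification times precisely, at the cost of still rendering the response server-side before the comparison.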
Error Response Design
Mobile error responses must be actionable by the client, not just human-readable:
{
"error": {
"code": "INSUFFICIENT_FUNDS",
"message": "Your account balance is too low for this purchase.",
"details": {
"required_amount": 29.99,
"available_amount": 15.50,
"currency": "USD"
},
"actions": [
{
"type": "deep_link",
"label": "Add funds",
"uri": "app://wallet/add-funds"
}
],
"retry_eligible": false
}
}

The code field enables client-side branching. The actions field allows the server to guide the user without requiring a client update.
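On the client, the code field becomes a dispatch key, avoiding brittle string-matching on the human-readable message. A sketch of the branching, with hypothetical handler and action names:

```python
def handle_error(error):
    # Dispatch on the machine-readable code; the message is display-only.
    handlers = {
        "INSUFFICIENT_FUNDS": lambda e: ("show_topup", e["details"]["required_amount"]),
        "RATE_LIMITED": lambda e: ("retry_later", None),
    }
    handler = handlers.get(error["code"])
    if handler is None:
        # Unknown code from a newer server: fall back to the
        # server-provided message instead of crashing.
        return ("show_message", error["message"])
    return handler(error)

action = handle_error({
    "code": "INSUFFICIENT_FUNDS",
    "message": "Your account balance is too low for this purchase.",
    "details": {"required_amount": 29.99, "available_amount": 15.50,
                "currency": "USD"},
})
```

The unknown-code fallback is the important part for long-lived client versions: old clients must degrade gracefully when the server introduces new error codes.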
Request Batching
For clients that need to make many small requests, provide a batch endpoint:
POST /batch
{
"requests": [
{"method": "GET", "path": "/users/123"},
{"method": "GET", "path": "/users/456"},
{"method": "GET", "path": "/products/789"}
]
}
Response:
{
"responses": [
{"status": 200, "body": {...}},
{"status": 200, "body": {...}},
{"status": 404, "body": {...}}
]
}
This turns N network round trips into one, saving battery and reducing total latency.
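Server-side, the batch endpoint is a loop that dispatches each sub-request through the normal routing layer and records a per-request status. A minimal sketch with a hypothetical in-process route table standing in for the real router:

```python
# Hypothetical route table; a real implementation would dispatch
# through the same routing and auth layers as direct requests.
ROUTES = {
    "/users/123": {"id": "123", "name": "Ada"},
    "/users/456": {"id": "456", "name": "Lin"},
}

def handle_batch(batch):
    responses = []
    for req in batch["requests"]:
        body = ROUTES.get(req["path"])
        if body is None:
            # Each sub-request carries its own status; one 404
            # must not fail the whole batch.
            responses.append({"status": 404, "body": {"error": "not_found"}})
        else:
            responses.append({"status": 200, "body": body})
    return {"responses": responses}

result = handle_batch({"requests": [
    {"method": "GET", "path": "/users/123"},
    {"method": "GET", "path": "/products/789"},
]})
```

Responses are returned in request order so the client can correlate them by index without extra request IDs.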
Timeouts and Deadlines
APIs should accept a client-specified deadline:
GET /search?q=shoes
X-Request-Deadline: 3000
The header value is in milliseconds. The server uses it to short-circuit expensive operations: if the search cannot complete within 3 seconds, return partial results rather than timing out.
Related: Comparing Search Implementations: Client vs Server.
import time

def handle_search(query, deadline_ms):
    start = time.monotonic()
    deadline_s = deadline_ms / 1000.0  # header value is in milliseconds
    results = []
    for shard in search_shards:
        # Reserve ~20% of the budget for merging, serialization, and transit.
        if time.monotonic() - start > deadline_s * 0.8:
            break  # Return what we have
        results.extend(shard.search(query))
    expected_total = estimate_total(query)  # e.g. from precomputed index counts
    return SearchResponse(
        results=results,
        is_partial=len(results) < expected_total,
        total_available=expected_total,
    )
Trade-offs
| Decision | Upside | Downside |
|---|---|---|
| BFF pattern | Optimized responses per screen | Another service to maintain |
| Field selection | Smaller payloads | Caching complexity (different field sets = different cache keys) |
| Cursor pagination | Stable under concurrent writes | Cannot jump to arbitrary page |
| Batch endpoint | Fewer round trips | Complex error handling per sub-request |
| Client deadline header | Server respects client timeout | Server must support partial responses |
Failure Modes
- BFF becomes a bottleneck: All mobile traffic funnels through one service. Mitigation: deploy BFF per feature team, not as a monolith.
- Field selection cache miss storm: Every unique field combination is a separate cache entry. Mitigation: define standard field sets ("minimal", "standard", "full") and cache those.
- Batch request partial failure: 2 of 3 sub-requests succeed. Client must handle per-request status codes within the batch response.
- Stale ETag after server-side data migration: ETags computed from old data format invalidate all client caches simultaneously. Mitigation: version ETags separately from data format.
- Overly aggressive compression: A highly compressed response on a low-end device causes high CPU usage during decompression. Mitigation: use gzip level 6 (balanced), not level 9 (maximum).
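The standard-field-set mitigation can be a small lookup that snaps arbitrary fields parameters onto a few cacheable presets. A sketch with hypothetical set definitions:

```python
# Hypothetical presets; each maps to one cache key instead of
# one key per arbitrary field combination.
FIELD_SETS = {
    "minimal": ["user.name", "user.avatar_url"],
    "standard": ["user.name", "user.avatar_url", "follower_count"],
    "full": [],  # empty list = return everything
}

def resolve_field_set(requested):
    # Snap the request to the smallest preset that covers it; the
    # client may receive a few extra fields, but the cache stays hot.
    wanted = set(requested)
    for name in ("minimal", "standard"):
        if wanted <= set(FIELD_SETS[name]):
            return name
    return "full"
```

The trade is a slightly larger payload for a bounded cache keyspace, which is usually the right call for hot endpoints.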
See also: Building a Minimal Feature Flag Service.
Scaling Considerations
- BFF services should be stateless and horizontally scalable. They are pure aggregation layers.
- Cache aggregated responses at the BFF level with short TTLs (30-60 seconds). This absorbs repeated requests for the same screen.
- For global deployment, BFF instances should be co-located with backend services to minimize inter-service latency.
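The short-TTL cache at the BFF layer can be sketched as a timestamped dict; in production this would typically be Redis or a similar shared store rather than in-process memory:

```python
import time

class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() > expires_at:
            del self.store[key]  # lazy expiry on read
            return None
        return value

    def set(self, key, value):
        self.store[key] = (time.monotonic() + self.ttl, value)

cache = TTLCache(ttl_seconds=30)
cache.set("profile:123:standard", {"follower_count": 1234})
```

Keying on user, screen, and field set (as in the hypothetical key above) keeps the keyspace aligned with the standard field sets discussed under Failure Modes.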
Observability
- Track: response size distribution, cache hit rate for conditional requests (304 vs 200), field selection usage patterns, batch endpoint usage, client deadline misses.
- Alert on: average response size exceeding target per endpoint, 304 rate dropping (indicates caching issues), batch endpoint error rate.
- Log: field selection patterns to identify common sets worth standardizing.
Key Takeaways
- Design one-request-per-screen endpoints. Multiple sequential calls are the primary cause of slow mobile experiences.
- Support conditional requests (ETags, If-Modified-Since) on every endpoint that returns data the client caches.
- Use cursor-based pagination. Offset pagination breaks under concurrent writes, which are common in feed-style UIs.
- Make error responses machine-actionable. Include error codes, structured details, and suggested actions.
- Accept client deadlines. A partial response in 3 seconds is more useful than a complete response in 15 seconds.
Further Reading
- Designing Rate Limiting for Mobile APIs: Rate limiting strategies for APIs consumed by mobile clients, covering token bucket algorithms, client identification, degradation modes,...
- Designing Idempotent APIs for Mobile Clients: How to design APIs that handle duplicate requests safely, covering idempotency keys, server-side deduplication, and failure scenarios spe...
- Comparing REST vs GraphQL for Mobile Clients: Measured payload sizes, request counts, latency, and battery impact of REST vs GraphQL APIs serving a mobile application with varying net...
Final Thoughts
A mobile-friendly API is not a dumbed-down API. It is an API that respects the physical constraints of the device and network it serves. Every byte over the wire costs battery and data plan. Every extra round trip costs seconds the user will not wait. Design accordingly.