Designing APIs With Mobile Constraints in Mind

Dhruval Dhameliya·August 10, 2025·7 min read

How to design backend APIs that account for mobile-specific constraints: bandwidth, latency, battery, intermittent connectivity, and long-lived client versions.

APIs designed for web frontends often perform poorly on mobile. They assume fast, reliable connections, unlimited bandwidth, and clients that can be updated instantly. None of these hold for mobile. This post covers how to design APIs that work well under mobile constraints from the start.

Context

Mobile clients differ from web clients in fundamental ways: they operate on variable-quality networks, they consume battery with every network request, they cannot be force-updated, and they have limited memory for caching large responses. An API that ignores these constraints forces the mobile team to build complex workarounds on the client.

Problem

Design APIs that:

  • Minimize bandwidth usage and request count
  • Tolerate high latency and intermittent connectivity
  • Support long-lived client versions without breaking changes
  • Return data in shapes that match mobile UI requirements

Constraints

| Constraint | Detail |
|---|---|
| Bandwidth | Metered connections, 50 Kbps to 100 Mbps range |
| Latency | 50 ms (WiFi) to 5000 ms (2G) RTT |
| Battery | Each request activates the radio, costing battery even for small payloads |
| Client versions | 20+ app versions in production simultaneously |
| Memory | Responses must be parseable without loading the entire payload into memory on low-end devices |

Design

Response Design: Aggregate, Do Not Scatter

The single worst pattern for mobile is requiring multiple sequential API calls to render one screen.

Bad: N+1 calls to render a profile

GET /users/123           -> user data
GET /users/123/posts     -> post list
GET /users/123/followers -> follower count
GET /users/123/badges    -> achievement badges

Good: Single aggregated endpoint

GET /users/123/profile

{
    "user": { ... },
    "recent_posts": [ ... ],
    "follower_count": 1234,
    "badges": [ ... ]
}

This is the Backend for Frontend (BFF) pattern. A mobile-specific BFF aggregates data from multiple services into a single response shaped for the screen.
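As a sketch, a BFF profile endpoint fans out to the upstream services in parallel and assembles the screen's payload. The fetch_* functions below are hypothetical stand-ins for real RPC or HTTP calls:

```python
import concurrent.futures

# Hypothetical upstream fetchers; in a real BFF these would be RPC or
# HTTP calls to the user, posts, social-graph, and badges services.
def fetch_user(user_id):
    return {"id": user_id, "name": "Ada"}

def fetch_posts(user_id):
    return [{"id": 1, "title": "Hello"}]

def fetch_follower_count(user_id):
    return 1234

def fetch_badges(user_id):
    return [{"id": "early_adopter"}]

def get_profile(user_id):
    """Aggregate four upstream calls into one screen-shaped response."""
    # The calls are independent, so issue them in parallel: the client
    # waits for the slowest upstream, not for the sum of all four.
    with concurrent.futures.ThreadPoolExecutor() as pool:
        user = pool.submit(fetch_user, user_id)
        posts = pool.submit(fetch_posts, user_id)
        followers = pool.submit(fetch_follower_count, user_id)
        badges = pool.submit(fetch_badges, user_id)
        return {
            "user": user.result(),
            "recent_posts": posts.result(),
            "follower_count": followers.result(),
            "badges": badges.result(),
        }
```

One server-side fan-out still costs far less than four mobile round trips, because inter-service latency inside a datacenter is typically sub-millisecond.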

Field Selection

Let clients request only the fields they need:

GET /users/123/profile?fields=user.name,user.avatar_url,follower_count

This reduces payload size by 50-80% for screens that only need a subset of the data.

def extract_nested(obj, field_path):
    for key in field_path.split("."):  # walk dotted paths like "user.name"
        obj = obj[key]
    return obj

def set_nested(obj, field_path, value):
    *parents, leaf = field_path.split(".")
    for key in parents:
        obj = obj.setdefault(key, {})
    obj[leaf] = value

def field_selection(full_response, requested_fields):
    if not requested_fields:
        return full_response  # default: return everything
    filtered = {}
    for path in requested_fields:
        set_nested(filtered, path, extract_nested(full_response, path))
    return filtered

Pagination

Mobile pagination must be cursor-based, not offset-based:

| Approach | Behavior on Mobile |
|---|---|
| Offset (?page=3&limit=20) | Insertions cause duplicates or skipped items while paginating |
| Cursor (?after=abc123&limit=20) | Stable regardless of insertions or deletions |

GET /feed?after=cursor_abc123&limit=20

Response:
{
    "items": [...],
    "cursors": {
        "next": "cursor_def456",
        "has_more": true
    }
}
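One way to implement opaque cursors is to encode the sort key of the last item returned. The sketch below assumes a feed ordered newest-first by (created_at, id); the base64-JSON encoding is illustrative, not a required format:

```python
import base64
import json

def encode_cursor(item):
    # The cursor is an opaque token naming a position in the sort
    # order, not an offset count.
    key = {"created_at": item["created_at"], "id": item["id"]}
    return base64.urlsafe_b64encode(json.dumps(key).encode()).decode()

def decode_cursor(cursor):
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))

def paginate(items, after=None, limit=20):
    """Return up to `limit` items strictly after the cursor position.

    `items` is assumed sorted newest-first by (created_at, id).
    """
    if after:
        pos = decode_cursor(after)
        items = [i for i in items
                 if (i["created_at"], i["id"]) < (pos["created_at"], pos["id"])]
    page = items[:limit]
    return {
        "items": page,
        "cursors": {
            "next": encode_cursor(page[-1]) if page else None,
            "has_more": len(items) > limit,
        },
    }
```

Because the cursor names a position rather than a count, newly inserted items shift the feed without duplicating or skipping anything on the next page.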

Compression and Payload Optimization

| Technique | Bandwidth Savings | Trade-off |
|---|---|---|
| Gzip response compression | 60-80% | CPU on client for decompression |
| JSON field shortening (e.g., fn instead of first_name) | 10-20% | Readability loss, debugging difficulty |
| Protocol Buffers | 30-50% over JSON | Schema management, less human-readable |
| Delta responses (send only changes) | 70-90% for updates | Complex client-side merge logic |

Recommendation: Use gzip compression on all responses (essentially free). Consider Protocol Buffers for high-frequency endpoints. Avoid field shortening; the readability cost outweighs the bandwidth savings.
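A quick sketch of what gzip buys on a repetitive JSON payload. The payload below is fabricated for illustration; real savings depend on response structure:

```python
import gzip
import json

# Fabricated, repetitive payload -- JSON key names repeat per item,
# which is exactly the redundancy gzip's dictionary coding exploits.
payload = json.dumps([
    {"first_name": "Ada", "last_name": "Lovelace", "follower_count": 1234}
    for _ in range(100)
]).encode()

# Level 6 balances CPU cost against compression ratio.
compressed = gzip.compress(payload, compresslevel=6)
ratio = len(compressed) / len(payload)
```

On payloads like this the compressed size is a small fraction of the original, which is why compression should be on by default rather than opt-in.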

Conditional Requests

Avoid re-downloading unchanged data:

Client sends: GET /feed
              If-None-Match: "etag_abc123"

Server responds: 304 Not Modified (empty body)

For collections that change partially:

Client sends: GET /feed
              If-Modified-Since: Sun, 26 Oct 2025 12:00:00 GMT

Server responds: 200 OK
{
    "items": [/* only items modified since the given timestamp */],
    "deleted_ids": ["id1", "id2"],
    "last_modified": "2025-10-27T08:30:00Z"
}
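Server-side, conditional request handling is a small amount of code. The sketch below derives the ETag by hashing the serialized body; the handler name and the (status, body, headers) return shape are illustrative:

```python
import hashlib
import json

def make_etag(body: bytes) -> str:
    # Content-derived ETag: identical bodies always produce the same
    # tag, so unchanged data yields a 304 no matter which server
    # instance handles the request.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def handle_feed(if_none_match=None):
    body = json.dumps({"items": [1, 2, 3]}).encode()
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, None, {"ETag": etag}  # empty body: no bytes re-sent
    return 200, body, {"ETag": etag}
```

The client stores the ETag alongside its cached copy and replays it in If-None-Match on the next request.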

Error Response Design

Mobile error responses must be actionable by the client, not just human-readable:

{
    "error": {
        "code": "INSUFFICIENT_FUNDS",
        "message": "Your account balance is too low for this purchase.",
        "details": {
            "required_amount": 29.99,
            "available_amount": 15.50,
            "currency": "USD"
        },
        "actions": [
            {
                "type": "deep_link",
                "label": "Add funds",
                "uri": "app://wallet/add-funds"
            }
        ],
        "retry_eligible": false
    }
}

The code field enables client-side branching. The actions field allows the server to guide the user without requiring a client update.
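On the client, branching on code might look like the following hypothetical handler (the function and its return shape are illustrative, not part of any SDK):

```python
def handle_error(error):
    # Branch on the machine-readable code; fall back to the server's
    # human-readable message for codes this app version doesn't know.
    code = error.get("code")
    if code == "INSUFFICIENT_FUNDS":
        details = error["details"]
        shortfall = details["required_amount"] - details["available_amount"]
        prompt = f"You need {shortfall:.2f} {details['currency']} more."
    else:
        prompt = error.get("message", "Something went wrong.")
    return {
        "prompt": prompt,
        # Server-suggested actions render as buttons without app changes.
        "actions": [a["label"] for a in error.get("actions", [])],
        "can_retry": error.get("retry_eligible", False),
    }
```

The else branch is the forward-compatibility story: a two-year-old client that has never seen a new code still shows something sensible.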

Request Batching

For clients that need to make many small requests, provide a batch endpoint:

POST /batch

{
    "requests": [
        {"method": "GET", "path": "/users/123"},
        {"method": "GET", "path": "/users/456"},
        {"method": "GET", "path": "/products/789"}
    ]
}

Response:
{
    "responses": [
        {"status": 200, "body": {...}},
        {"status": 200, "body": {...}},
        {"status": 404, "body": {...}}
    ]
}

This turns N network round trips into one, saving battery and reducing total latency.
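A minimal server-side dispatcher for such a batch endpoint might look like this. The route table is illustrative; a real implementation would reuse the framework's router and cap the number of sub-requests per batch:

```python
def handle_batch(batch, routes):
    # Each sub-request is executed and reported independently: one
    # failing entry must not fail the whole batch.
    responses = []
    for req in batch["requests"]:
        handler = routes.get((req["method"], req["path"]))
        if handler is None:
            responses.append({"status": 404, "body": {"error": "not_found"}})
            continue
        try:
            responses.append({"status": 200, "body": handler()})
        except Exception as exc:
            responses.append({"status": 500, "body": {"error": str(exc)}})
    return {"responses": responses}
```

Note the per-entry status codes: this is the "complex error handling per sub-request" trade-off, pushed onto the client by design.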

Timeouts and Deadlines

APIs should accept a client-specified deadline:

GET /search?q=shoes
X-Request-Deadline: 3000  // milliseconds

The server uses this to short-circuit expensive operations. If the search cannot complete in 3 seconds, return partial results rather than timing out.

Related: Comparing Search Implementations: Client vs Server.

import time

def handle_search(query, deadline_ms, search_shards, expected_total):
    start = time.monotonic()
    results = []

    for shard in search_shards:
        # Spend at most 80% of the deadline searching; reserve the
        # rest for serializing and sending the response.
        if (time.monotonic() - start) * 1000 > deadline_ms * 0.8:
            break  # return what we have
        results.extend(shard.search(query))

    return {
        "results": results,
        "is_partial": len(results) < expected_total,
        "total_available": expected_total,
    }

Trade-offs

| Decision | Upside | Downside |
|---|---|---|
| BFF pattern | Optimized responses per screen | Another service to maintain |
| Field selection | Smaller payloads | Caching complexity (different field sets = different cache keys) |
| Cursor pagination | Stable under concurrent writes | Cannot jump to an arbitrary page |
| Batch endpoint | Fewer round trips | Complex error handling per sub-request |
| Client deadline header | Server respects client timeout | Server must support partial responses |

Failure Modes

  • BFF becomes a bottleneck: All mobile traffic funnels through one service. Mitigation: deploy BFF per feature team, not as a monolith.
  • Field selection cache miss storm: Every unique field combination is a separate cache entry. Mitigation: define standard field sets ("minimal", "standard", "full") and cache those.
  • Batch request partial failure: 2 of 3 sub-requests succeed. Client must handle per-request status codes within the batch response.
  • Stale ETag after server-side data migration: ETags computed from old data format invalidate all client caches simultaneously. Mitigation: version ETags separately from data format.
  • Overly aggressive compression: A highly compressed response on a low-end device causes high CPU usage during decompression. Mitigation: use gzip level 6 (balanced), not level 9 (maximum).
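The standard-field-set mitigation can be as small as a named preset table. The set names and contents below are illustrative, as is the choice to fall back to the full response for unknown selections:

```python
# Named field-set presets: clients request ?fields=standard instead of
# arbitrary combinations, so the cache sees a handful of keys at most.
FIELD_SETS = {
    "minimal": ["user.name", "user.avatar_url"],
    "standard": ["user.name", "user.avatar_url", "follower_count"],
    "full": None,  # None means "return every field"
}

def resolve_fields(fields_param):
    if fields_param in FIELD_SETS:
        return FIELD_SETS[fields_param]
    # Unknown ad-hoc selections fall back to the full response rather
    # than polluting the cache with one-off entries.
    return None
```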

See also: Building a Minimal Feature Flag Service.

Scaling Considerations

  • BFF services should be stateless and horizontally scalable. They are pure aggregation layers.
  • Cache aggregated responses at the BFF level with short TTLs (30-60 seconds). This absorbs repeated requests for the same screen.
  • For global deployment, BFF instances should be co-located with backend services to minimize inter-service latency.

Observability

  • Track: response size distribution, cache hit rate for conditional requests (304 vs 200), field selection usage patterns, batch endpoint usage, client deadline misses.
  • Alert on: average response size exceeding target per endpoint, 304 rate dropping (indicates caching issues), batch endpoint error rate.
  • Log: field selection patterns to identify common sets worth standardizing.

Key Takeaways

  • Design one-request-per-screen endpoints. Multiple sequential calls are the primary cause of slow mobile experiences.
  • Support conditional requests (ETags, If-Modified-Since) on every endpoint that returns data the client caches.
  • Use cursor-based pagination. Offset pagination breaks under concurrent writes, which are common in feed-style UIs.
  • Make error responses machine-actionable. Include error codes, structured details, and suggested actions.
  • Accept client deadlines. A partial response in 3 seconds is more useful than a complete response in 15 seconds.

Final Thoughts

A mobile-friendly API is not a dumbed-down API. It is an API that respects the physical constraints of the device and network it serves. Every byte over the wire costs battery and data plan. Every extra round trip costs seconds the user will not wait. Design accordingly.
