Designing Mobile Systems for Poor Network Conditions
Architecture patterns for mobile apps that function reliably on slow, intermittent, and lossy networks, covering request prioritization, adaptive quality, and graceful degradation.
Most mobile apps are developed and tested on fast WiFi. In production, a significant portion of users are on 2G, 3G, or congested networks with 500ms+ round-trip times and frequent drops. Designing for these conditions is not an edge case optimization. It is a baseline requirement.
Related: Designing a Simple Metrics Collection Service.
See also: Designing Systems That Degrade Gracefully.
Context
Network conditions vary dramatically across geographies, carriers, and even within a single user's session (entering a tunnel, switching from WiFi to cellular). A system designed only for ideal conditions delivers a broken experience for 20-40% of the global user base.
Problem
Design a mobile system architecture that:
- Remains functional on networks with 1-5 second RTTs
- Degrades gracefully when bandwidth drops below 50 Kbps
- Prioritizes critical data over non-essential content
- Minimizes user-perceived latency even when actual latency is high
Constraints
| Constraint | Detail |
|---|---|
| RTT range | 50ms (4G/WiFi) to 5000ms (2G/satellite) |
| Bandwidth | 10 Kbps (worst) to 100 Mbps (best) |
| Packet loss | Up to 10% on congested cellular networks |
| Battery | Poor signal strength increases radio power consumption 3-5x |
| Data cost | Metered connections are the norm in many markets |
Design
Network Quality Detection
```kotlin
class NetworkQualityMonitor(context: Context) {
    private val connectivityManager =
        context.getSystemService(Context.CONNECTIVITY_SERVICE) as ConnectivityManager

    // Client and endpoint for empirical latency probes; point PING_URL at a
    // tiny endpoint you control (a HEAD to it should be a few hundred bytes).
    private val httpClient = OkHttpClient()

    fun getQuality(): NetworkQuality {
        val capabilities = connectivityManager.getNetworkCapabilities(
            connectivityManager.activeNetwork
        ) ?: return NetworkQuality.NONE
        val downstreamKbps = capabilities.linkDownstreamBandwidthKbps
        return when {
            downstreamKbps > 10_000 -> NetworkQuality.EXCELLENT
            downstreamKbps > 2_000 -> NetworkQuality.GOOD
            downstreamKbps > 500 -> NetworkQuality.MODERATE
            downstreamKbps > 50 -> NetworkQuality.POOR
            else -> NetworkQuality.TERRIBLE
        }
    }

    // The OS-reported bandwidth is an estimate; supplement it with
    // empirical measurement against a known endpoint
    fun measureLatency(callback: (Long) -> Unit) {
        val start = SystemClock.elapsedRealtime()
        httpClient.newCall(Request.Builder().url(PING_URL).head().build())
            .enqueue(object : Callback {
                override fun onResponse(call: Call, response: Response) {
                    response.close()
                    callback(SystemClock.elapsedRealtime() - start)
                }

                override fun onFailure(call: Call, e: IOException) {
                    callback(Long.MAX_VALUE)
                }
            })
    }

    companion object {
        private const val PING_URL = "https://example.com/ping" // placeholder
    }
}
```

Request Prioritization
Not all requests are equal. On poor networks, only critical requests should proceed immediately:
| Priority | Examples | Behavior on Poor Network |
|---|---|---|
| Critical | Auth, payment confirmation | Proceed immediately, extended timeout |
| High | Primary content load, user actions | Proceed, standard timeout |
| Medium | Secondary content, prefetch | Defer until network improves |
| Low | Analytics, non-critical images | Queue for batch send on WiFi |
```kotlin
class PrioritizedRequestQueue(
    private val networkMonitor: NetworkQualityMonitor,
    private val dispatcher: CoroutineDispatcher = Dispatchers.IO
) {
    private val criticalQueue = Channel<NetworkRequest>(Channel.UNLIMITED)
    private val normalQueue = Channel<NetworkRequest>(Channel.UNLIMITED)
    private val deferredQueue = Channel<NetworkRequest>(Channel.UNLIMITED)

    suspend fun enqueue(request: NetworkRequest) {
        // NetworkQuality is declared worst-to-best, so ordinal comparison works
        when {
            request.priority == Priority.CRITICAL -> criticalQueue.send(request)
            // Low priority is always deferred, regardless of current quality
            request.priority == Priority.LOW -> deferredQueue.send(request)
            request.priority == Priority.HIGH -> normalQueue.send(request)
            // Medium priority proceeds only when the network is usable
            networkMonitor.getQuality() >= NetworkQuality.MODERATE -> normalQueue.send(request)
            else -> deferredQueue.send(request)
        }
    }

    fun startProcessing() {
        // Critical queue: always processed, higher concurrency
        // Normal queue: processed with bounded concurrency
        // Deferred queue: processed only when network quality is GOOD+
    }
}
```

Adaptive Content Quality
Serve different content quality based on network conditions:
```kotlin
class AdaptiveImageLoader(
    private val networkMonitor: NetworkQualityMonitor
) {
    fun getImageUrl(baseUrl: String): String {
        val quality = when (networkMonitor.getQuality()) {
            NetworkQuality.EXCELLENT -> "high"
            NetworkQuality.GOOD -> "medium"
            NetworkQuality.MODERATE -> "low"
            else -> "thumbnail"
        }
        return "$baseUrl?quality=$quality"
    }
}
```

For API responses, use a quality parameter:
- Full quality: all fields, embedded objects, full text
- Reduced: essential fields only, IDs instead of embedded objects
- Minimal: summary data only, load details on demand
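A sketch of how a client might select these tiers. The `detail` query parameter and tier names are assumptions for illustration, not a real API; `NetworkQuality` mirrors the enum used by the monitor above:

```kotlin
// Worst-to-best ordering so ordinal comparison works
enum class NetworkQuality { NONE, TERRIBLE, POOR, MODERATE, GOOD, EXCELLENT }

// Hypothetical response detail tiers, matching the list above
enum class ResponseDetail(val param: String) {
    FULL("full"), REDUCED("reduced"), MINIMAL("minimal")
}

fun detailFor(quality: NetworkQuality): ResponseDetail = when {
    quality >= NetworkQuality.GOOD -> ResponseDetail.FULL
    quality >= NetworkQuality.MODERATE -> ResponseDetail.REDUCED
    else -> ResponseDetail.MINIMAL
}

fun buildUrl(base: String, quality: NetworkQuality): String =
    "$base?detail=${detailFor(quality).param}"
```

The server interprets `detail` when shaping the response, so one endpoint serves all tiers.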
Optimistic UI
Show the expected result immediately, reconcile with the server response later:
```kotlin
class OptimisticActionHandler(
    private val localDb: AppDatabase,
    private val api: ApiService
) {
    suspend fun likePost(postId: String) {
        // 1. Update UI immediately via local DB
        localDb.postDao().incrementLikeCount(postId)
        localDb.postDao().setLiked(postId, true)

        // 2. Send to server in background
        try {
            api.likePost(postId)
        } catch (e: Exception) {
            // 3. Rollback on failure
            localDb.postDao().decrementLikeCount(postId)
            localDb.postDao().setLiked(postId, false)
            // Notify user of failure
        }
    }
}
```

Prefetching and Caching
On good network conditions, preload data the user is likely to need:
```kotlin
class PrefetchManager(
    private val networkMonitor: NetworkQualityMonitor,
    private val cache: ContentCache
) {
    fun maybePrefetch(predictedScreens: List<Screen>) {
        if (networkMonitor.getQuality() < NetworkQuality.GOOD) return
        if (networkMonitor.isMetered()) return

        for (screen in predictedScreens) {
            val data = screen.requiredData()
            if (!cache.has(data.cacheKey)) {
                fetchAndCache(data)
            }
        }
    }
}
```

Connection Pooling and Multiplexing
- Use HTTP/2 for connection multiplexing. Sending multiple requests over a single TCP connection eliminates repeated handshake overhead.
- Keep-alive connections reduce per-request latency by 200-500ms on slow networks.
- Use connection coalescing for requests to the same host.
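With OkHttp, most of this is client configuration. A minimal sketch; the pool sizes are illustrative defaults, not tuned values:

```kotlin
import java.util.concurrent.TimeUnit
import okhttp3.ConnectionPool
import okhttp3.OkHttpClient
import okhttp3.Protocol

// HTTP/2 is negotiated over TLS when listed; the pool keeps idle
// connections warm so follow-up requests skip the handshake entirely.
val client = OkHttpClient.Builder()
    .protocols(listOf(Protocol.HTTP_2, Protocol.HTTP_1_1))
    .connectionPool(ConnectionPool(5, 5, TimeUnit.MINUTES))
    .build()
```

OkHttp also coalesces connections automatically for hosts that share a certificate and resolve to the same IP, so no extra code is needed for that.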
Timeout Strategy
Static timeouts fail on variable networks. Use adaptive timeouts:
| Network Quality | Connect Timeout | Read Timeout | Total Timeout |
|---|---|---|---|
| Excellent | 5s | 10s | 15s |
| Good | 10s | 15s | 30s |
| Moderate | 15s | 30s | 45s |
| Poor | 20s | 45s | 60s |
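The table translates directly into code. A sketch that simply encodes the rows above, with quality supplied by the monitor described earlier:

```kotlin
// Worst-to-best ordering, matching the monitor's enum
enum class NetworkQuality { NONE, TERRIBLE, POOR, MODERATE, GOOD, EXCELLENT }

data class Timeouts(val connectSec: Long, val readSec: Long, val totalSec: Long)

// Encodes the timeout table; POOR and worse share the longest budget
fun timeoutsFor(quality: NetworkQuality): Timeouts = when (quality) {
    NetworkQuality.EXCELLENT -> Timeouts(5, 10, 15)
    NetworkQuality.GOOD -> Timeouts(10, 15, 30)
    NetworkQuality.MODERATE -> Timeouts(15, 30, 45)
    else -> Timeouts(20, 45, 60)
}
```

With OkHttp these can be applied per call via `client.newBuilder()`, which shares the connection pool while overriding timeouts, rather than constructing a fresh client.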
Trade-offs
| Decision | Upside | Downside |
|---|---|---|
| Request prioritization | Critical paths unblocked | Deferred requests may never execute in long offline periods |
| Adaptive quality | Usable on any network | Inconsistent content quality across sessions |
| Optimistic UI | Instant perceived response | Rollback causes visible state changes |
| Aggressive prefetching | Data ready when needed | Wasted bandwidth if predictions are wrong |
| Adaptive timeouts | Fewer false timeout errors | Longer waits on degraded networks |
Failure Modes
- Network quality oscillation: Rapid switching between quality levels causes request priority flapping. Mitigation: use a smoothing window (average quality over the last 30 seconds).
- Optimistic UI rollback cascade: A failed action triggers a chain of rollbacks in dependent UI elements. Mitigation: design optimistic actions to be independent, or batch dependent actions.
- Stale prefetched data: Prefetched data becomes outdated before use. Mitigation: attach TTLs to prefetched content, revalidate on display.
- Timeout too long on poor networks: User stares at a spinner for 60 seconds. Mitigation: show cached/stale data immediately with a "refreshing" indicator.
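The smoothing-window mitigation can be sketched as a time-windowed average over quality samples; quality ordinals stand in for the enum here, and callers would pass a monotonic clock such as `SystemClock.elapsedRealtime()`:

```kotlin
// Averages quality samples over a sliding window to damp oscillation
class SmoothedQualityTracker(private val windowMillis: Long = 30_000) {
    private val samples = ArrayDeque<Pair<Long, Int>>() // (timestamp, quality ordinal)

    fun record(nowMillis: Long, qualityOrdinal: Int): Int {
        samples.addLast(nowMillis to qualityOrdinal)
        // Drop samples older than the window
        while (samples.isNotEmpty() && nowMillis - samples.first().first > windowMillis) {
            samples.removeFirst()
        }
        return current()
    }

    fun current(): Int =
        if (samples.isEmpty()) 0
        else samples.map { it.second }.average().toInt()
}
```

A brief dip into POOR no longer reroutes every in-flight request; the reported level moves only when the degradation persists.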
Scaling Considerations
- Adaptive quality requires the backend to support multiple response formats. Use content negotiation headers to avoid separate endpoints.
- Prefetch predictions can be powered by analytics data (most common navigation paths), reducing wasted prefetch bandwidth.
- For global apps, deploy edge servers in regions with poor infrastructure to reduce RTT.
Observability
- Track: request success rate by network quality tier, p50/p95 latency by quality tier, deferred request completion rate, optimistic UI rollback rate.
- Alert on: success rate dropping below 90% for any quality tier, rollback rate exceeding 5%.
- Segment all metrics by network quality. Averages across all users hide poor-network issues.
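A sketch of the segmentation point: tag each request metric with its quality tier at record time, then aggregate per tier (the metric shape is illustrative):

```kotlin
data class RequestMetric(
    val endpoint: String,
    val success: Boolean,
    val latencyMillis: Long,
    val qualityTier: String // tagged at record time, not at query time
)

// Per-tier success rate; a single overall average would hide the POOR tier
fun successRateByTier(metrics: List<RequestMetric>): Map<String, Double> =
    metrics.groupBy { it.qualityTier }
        .mapValues { (_, group) -> group.count { it.success }.toDouble() / group.size }
```

Tagging at record time matters because the tier at query time may differ from the tier the request actually ran under.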
Key Takeaways
- Detect network quality continuously, not just at app start. Conditions change mid-session.
- Prioritize requests ruthlessly. On poor networks, only critical and high-priority requests should fire.
- Use optimistic UI for user actions. Perceived speed matters more than actual speed.
- Prefetch on good networks, serve from cache on poor ones. Shift work to when conditions are favorable.
- Every timeout and retry parameter should be adaptive. Static values optimize for one network condition and fail on all others.
Further Reading
- Designing Retry and Backoff Strategies for Mobile Networks: A detailed look at retry strategies for mobile clients, covering exponential backoff, jitter, circuit breakers, and adaptive retry policies.
- Designing Background Job Systems for Mobile Apps: Architecture for reliable background job execution on Android, covering WorkManager, job prioritization, and constraint handling.
- Designing Idempotent APIs for Mobile Clients: How to design APIs that handle duplicate requests safely, covering idempotency keys, server-side deduplication, and failure scenarios specific to mobile clients.
Final Thoughts
Designing for poor networks is not about handling failure gracefully. It is about delivering a functional product when the network is working against you. The techniques in this post are not optimizations. They are the difference between an app that works globally and one that works only in cities with 5G coverage.