The Hyperscale Lottery: How State-Space Models Have Sacrificed Edge Efficiency
2026-04-09 • Hardware Architecture
AI summary
The authors discuss how the computer hardware that is available shapes the direction of AI research, an effect known as the Hardware Lottery. They introduce the idea of a Hyperscale Lottery, in which AI models are designed mainly to run well on large cloud systems, which can hurt their speed and efficiency on smaller devices such as phones. They show that newer versions of a model called Mamba, though well suited to large cloud systems, run slower on edge devices than earlier versions. The authors suggest that model design should keep strategies for saturating large cloud systems separate from the core architecture, so that models remain viable on smaller, real-time devices.
Hardware Lottery, Hyperscale Lottery, Edge Intelligence, State-Space Models, Mamba, Cloud Throughput, Algorithmic Efficiency, Latency, Hyperscale GPUs, Real-time Computing
Authors
Robin Geens, Jonas De Schouwer, Marian Verhelst, Thierry Tambe
Abstract
The Hardware Lottery posits that research directions are dictated by available silicon compute platforms. We identify a derivative phenomenon, the Hyperscale Lottery, where model architectures are optimized for cloud throughput at the expense of algorithmic efficiency. While State-Space Models (SSMs) such as Mamba were lauded for their linear complexity, ideal for edge intelligence, their evolution from Mamba-1 to Mamba-3 reveals a systematic divergence from edge-native efficiency. We demonstrate that Mamba-3's architectural changes, designed to saturate hyperscale GPUs, impose a significant edge penalty: a 28% latency increase at 880M parameters, worsening to 48% for 15M-parameter models. We argue for decoupling cloud-scale saturation strategies from core architectural design to preserve the viability of single-user, real-time edge intelligence.
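To make the "linear complexity" claim concrete, here is a minimal sketch of a generic diagonal linear recurrence, the basic pattern that state-space models build on. It is not the Mamba-1 or Mamba-3 implementation. The function names (`ssm_step`, `run_sequence`) and the parameters `a`, `b`, `c` are hypothetical, chosen for illustration only. The point is that each new token costs a fixed amount of work and the state never grows with sequence length, which is the property that makes this model family attractive for single-user, real-time inference.

```python
from typing import List, Sequence


def ssm_step(state: List[float], x_t: float, a: Sequence[float],
             b: Sequence[float], c: Sequence[float]) -> float:
    """Advance a diagonal linear recurrence by one token, updating state in place.

    h_t = a * h_{t-1} + b * x_t   (elementwise over the state dimension)
    y_t = sum(c * h_t)

    Assumption: a, b, c are fixed per-channel coefficients. Real SSMs such as
    Mamba make them input-dependent; that is omitted here.
    """
    for i in range(len(state)):
        state[i] = a[i] * state[i] + b[i] * x_t
    return sum(c_i * h_i for c_i, h_i in zip(c, state))


def run_sequence(xs: Sequence[float], a: Sequence[float],
                 b: Sequence[float], c: Sequence[float]) -> List[float]:
    """Process a sequence token by token with a constant-size state."""
    state = [0.0] * len(a)  # size depends on the state dimension, not on len(xs)
    return [ssm_step(state, x, a, b, c) for x in xs]


if __name__ == "__main__":
    # One state channel with decay 0.5: each output is a decayed running sum.
    print(run_sequence([1.0, 1.0, 1.0], a=[0.5], b=[1.0], c=[1.0]))
```

Total work is proportional to sequence length times state dimension, and memory is proportional to the state dimension alone. Processing one token at a time like this is the single-user, real-time setting the abstract describes, where per-token latency matters more than batch throughput.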