LOLLA: Deep Reinforcement Learning for Closed-Loop Link Adaptation Towards a GPU-Accelerated AI-RAN
2026-06-22 • Machine Learning
Machine Learning, Networking and Internet Architecture
AI summary
The authors improved on outer-loop link adaptation (OLLA), a method used in 5G networks to adjust data transmission as signal quality changes. They created LOLLA, which uses deep reinforcement learning to make smarter adjustments from detailed PHY/MAC telemetry instead of the single-bit feedback OLLA relies on. Their approach keeps MCS selection compliant with 3GPP standards, learns to hold error rates within configurable targets of 1% to 15%, and runs with end-to-end control latency under 500 microseconds on a GPU-accelerated 5G NR stack. Tests showed throughput gains of 15% to 92% over OLLA at Doppler frequencies up to 400 Hz, and the learned policy generalizes to unseen channel models and scales to eight concurrent users.
5G NR, Outer-loop Link Adaptation (OLLA), Deep Reinforcement Learning, Signal-to-Interference-plus-Noise Ratio (SINR), Modulation and Coding Scheme (MCS), Proximal Policy Optimization (PPO), Block Error Rate (BLER), Doppler Frequency, PHY/MAC Telemetry, GPU-accelerated
Authors
Rui Wang, Linchao Zhang, Qiang Liu, Kun Yang
Abstract
Outer-loop link adaptation (OLLA) is widely deployed in 5G NR to track channel variations, yet its reliance on first-order, single-bit feedback degrades performance significantly under high-mobility and fast-varying channels. This paper presents LOLLA (Learned Outer-Loop Link Adaptation), a deep reinforcement learning framework that replaces the conventional OLLA staircase with a learned, continuous SINR offset conditioned on rich PHY/MAC telemetry inaccessible to OLLA. The offset modulates the SINR-to-MCS lookup table, preserving 3GPP-compliant MCS selection and provably subsuming the conventional OLLA update rule. A Proximal Policy Optimization (PPO) policy trained under a Lagrangian block error rate (BLER) constraint automatically enforces tunable reliability targets from 1% to 15% without manual penalty calibration. The framework is realized as the first closed-loop AI-native control dApp on a GPU-accelerated 5G NR stack, achieving end-to-end control latencies under 500 microseconds. Evaluations under 3GPP TDL channel models demonstrate 15% to 92% throughput gains over OLLA across Doppler frequencies up to 400 Hz, while attaining a Pareto frontier that strictly dominates OLLA across all evaluated reliability targets. The learned policy generalizes to unseen channel models and scales to eight concurrent UEs under shared-resource scheduling. In the uplink formulation, the gNB directly observes decoding outcomes, enabling simulation-to-deployment parity.
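To make the mechanism in the abstract concrete, the following is a minimal Python sketch of the three pieces it names: an offset applied to the SINR before the SINR-to-MCS lookup, the conventional OLLA staircase driven by ACK/NACK feedback, and a Lagrangian treatment of the BLER constraint. It is an illustration under stated assumptions, not the authors' implementation. The threshold table, the function names (`select_mcs`, `olla_step`, `dual_step`, `lagrangian_reward`), the step sizes, and the exact form of the reward are all hypothetical; the abstract does not specify them. The staircase follows the textbook OLLA rule in which the up/down step ratio is set by the BLER target.

```python
from bisect import bisect_right

# Hypothetical SINR thresholds in dB (ascending); entry i is the minimum
# effective SINR for MCS index i. Illustrative values only, not taken from
# the paper or from any 3GPP table.
MCS_THRESHOLDS_DB = [-5.0, -2.0, 1.0, 4.0, 7.0, 10.0, 13.0, 16.0]


def select_mcs(sinr_db: float, offset_db: float) -> int:
    """Return the highest MCS index whose threshold is <= sinr_db + offset_db."""
    effective_db = sinr_db + offset_db
    idx = bisect_right(MCS_THRESHOLDS_DB, effective_db) - 1
    return max(idx, 0)  # below the lowest threshold, use the most robust MCS


def olla_step(offset_db: float, ack: bool, bler_target: float,
              step_down_db: float = 1.0) -> float:
    """Textbook OLLA staircase: small step up on ACK, large step down on NACK.

    The step ratio is chosen so the offset is stationary when the long-run
    NACK fraction equals bler_target.
    """
    step_up_db = step_down_db * bler_target / (1.0 - bler_target)
    return offset_db + step_up_db if ack else offset_db - step_down_db


def dual_step(lmbda: float, observed_bler: float, bler_target: float,
              lr: float = 0.05) -> float:
    """Projected dual ascent on the multiplier of the BLER constraint (assumed form)."""
    return max(0.0, lmbda + lr * (observed_bler - bler_target))


def lagrangian_reward(throughput: float, nack: bool, lmbda: float,
                      bler_target: float) -> float:
    """Throughput penalized by constraint violation, weighted by the multiplier."""
    return throughput - lmbda * (float(nack) - bler_target)
```

In this sketch a learned policy would supply `offset_db` as a continuous value in place of the output of `olla_step`, while `select_mcs` stays unchanged, which is one way to read the claim that MCS selection remains table-based. A policy that happens to output the staircase value reproduces conventional OLLA as a special case. Updating the multiplier with `dual_step` between policy updates is what removes the need to hand-tune a fixed penalty weight: the multiplier grows while the observed BLER exceeds the target and decays toward zero otherwise.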