Throughput Optimization for Multi-AP IEEE P802.11bq Networks Based on Combinatorial Multi-Armed Bandits

2026-06-02 | Networking and Internet Architecture

AI summary

The authors study how to improve throughput in dense multi-AP networks based on IEEE P802.11bq, where performance depends on interacting factors such as channel contention, directional beams, and interference. They build a packet-level model of how configuration settings affect performance and formulate the optimization as a multi-group combinatorial multi-armed bandit. In simulations, their proposed learning method eliminates low-performing configuration options group by group, achieves higher throughput than a Thompson-sampling baseline across most AP densities, and shortens throughput stabilization time by approximately 49%. The results show that high throughput requires balancing several network parameters rather than simply avoiding collisions or maximizing the PHY rate.

IEEE P802.11bq, CSMA/CA, RTS/CTS, Beam-training, Directional mmWave interference, SINR, MCS selection, Combinatorial Multi-Armed Bandit, Thompson sampling, Clear-channel assessment
Authors
Anshan Yuan, Mingqi Han, Xinghua Sun
Abstract
This paper addresses distributed throughput optimization for dense multi-AP IEEE P802.11bq networks. We develop a packet-level model that jointly captures cross-link carrier-sense multiple access with collision avoidance (CSMA/CA), sub-7 GHz RTS/CTS exchange, beam-training overhead, directional mmWave interference, signal-to-interference-plus-noise ratio (SINR)-based modulation and coding scheme (MCS) selection, and retransmissions. The resulting configuration problem is formulated as a multi-group combinatorial multi-armed bandit (CMAB), in which each AP selects its contention window, clear-channel assessment threshold, beamwidth, and MCS reservation margin from finite candidate sets. Inspired by combinatorial successive accept-reject (CSAR) methods, we propose a group-wise feasible CSAR variant that uses Hadamard-guided feasible exploration to estimate empirical ranking scores and eliminate low-performing candidates within each parameter group. Simulations show that the proposed scheme improves aggregate and per-AP throughput over the considered Thompson-sampling baseline across most AP densities and reduces throughput stabilization time by approximately 49% under the evaluated settings. The learned configurations reveal that high throughput requires a balance among control-channel aggressiveness, mmWave spatial reuse, beam-training cost, and MCS robustness, rather than simply minimizing collisions or maximizing the PHY rate.
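
The group-wise elimination idea can be illustrated for a single AP with the rough sketch below. It is not the authors' algorithm: the Hadamard-guided exploration is replaced here by plain balanced random exploration, and the packet-level simulator is replaced by a made-up reward function (`toy_throughput`). The candidate values in `GROUPS`, the optimum in `PEAK`, and all function names are hypothetical and chosen only for illustration. In each phase, every sampled configuration is feasible in the sense that it picks exactly one surviving candidate per parameter group, and the candidate with the lowest empirical mean reward is rejected from each group.

```python
import random
from statistics import mean

# Hypothetical candidate sets for one AP (illustrative values, not from the paper).
GROUPS = {
    "contention_window": [15, 31, 63, 127],
    "cca_threshold_dbm": [-82, -72, -62],
    "beamwidth_deg": [10, 30, 60],
    "mcs_margin_db": [0, 2, 4],
}
# Assumed optimum of the toy reward: an interior value in every group.
PEAK = {"contention_window": 31, "cca_threshold_dbm": -72,
        "beamwidth_deg": 30, "mcs_margin_db": 2}


def toy_throughput(config, rng):
    """Made-up noisy reward; a stand-in for a measured throughput sample."""
    penalty = sum(abs(GROUPS[g].index(v) - GROUPS[g].index(PEAK[g]))
                  for g, v in config.items())
    return 10.0 - penalty + rng.gauss(0.0, 0.3)


def balanced_schedule(values, pulls, rng):
    """Shuffled list of length `pulls` that uses each value about equally often."""
    reps = -(-pulls // len(values))  # ceiling division
    schedule = (values * reps)[:pulls]
    rng.shuffle(schedule)
    return schedule


def groupwise_elimination(groups, reward_fn, pulls_per_phase=240, seed=0):
    rng = random.Random(seed)
    active = {g: list(vals) for g, vals in groups.items()}
    while any(len(vals) > 1 for vals in active.values()):
        # Explore: each pull combines one surviving candidate from every group.
        schedules = {g: balanced_schedule(vals, pulls_per_phase, rng)
                     for g, vals in active.items()}
        samples = {g: {v: [] for v in vals} for g, vals in active.items()}
        for t in range(pulls_per_phase):
            config = {g: schedules[g][t] for g in active}
            reward = reward_fn(config, rng)
            for g, v in config.items():
                samples[g][v].append(reward)
        # Reject: drop the candidate with the lowest empirical mean in each group.
        for g, vals in active.items():
            if len(vals) > 1:
                worst = min(vals, key=lambda v: mean(samples[g][v]))
                vals.remove(worst)
    return {g: vals[0] for g, vals in active.items()}


best = groupwise_elimination(GROUPS, toy_throughput)
print(best)
```

Because the toy reward peaks at an interior value of every group, the surviving configuration is a compromise rather than an extreme setting, which loosely mirrors the balance described in the abstract. A real deployment would replace `toy_throughput` with measured per-AP throughput and would need the paper's exploration design and elimination rule.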