Real-Time Parallel Counterfactual Regret Minimization

2026-05-19Computer Science and Game Theory

Computer Science and Game TheoryArtificial IntelligenceMachine Learning
AI summary

AI summary is being generated…

Authors
Boning Li, Longbo Huang
Abstract
Counterfactual Regret Minimization (CFR) is the dominant algorithmic family for solving large imperfect-information games, underpinning breakthroughs such as Libratus and Pluribus in No-Limit Texas Hold'em poker. In real-time game-playing systems, the solver must compute a near-equilibrium strategy within a strict time budget of only a few seconds per decision, and the number of CFR iterations completed in this window directly determines play strength. We present \textbf{Parallel CFR}, the first parallelization framework for real-time depth-limited CFR solving that seamlessly integrates pruning, abstraction, and advanced CFR variants. We decompose each CFR iteration into a pipeline of seven stages and identify two orthogonal dimensions of parallelism: \emph{by information set} and \emph{by tree node}. Leaf node evaluation is offloaded to GPUs via batched neural network inference, creating a heterogeneous CPU--GPU pipeline. Experiments on Heads-Up No-Limit Texas Hold'em demonstrate that Parallel CFR achieves $3.3$--$3.4\times$ speedup over the single-threaded baseline on postflop streets, with per-iteration time of ${\sim}47$--$54$~ms on a depth-limited game tree with over $1$ billion histories. All experiments run on a single desktop-class device (NVIDIA DGX Spark), enabling hundreds of CFR iterations within a typical real-time decision budget without requiring datacenter-scale infrastructure.