Two-Fidelity Best-Action Identification for Stochastic Minimax Tree

2026-06-01 • Machine Learning

Machine Learning, Artificial Intelligence

AI summary

The authors study how to efficiently find the best action in stochastic minimax trees, where outcomes are random, a problem important in AI planning. They introduce a method called 2FFS that balances cheap but biased heuristic evaluations against costly but accurate ones during the search. The method mixes minimax-style expansion with MCTS-style sampling and decides adaptively which type of evaluation to use, saving samples and computation. They prove that the method is correct at a fixed confidence level and stops in finite time, and they report that it outperforms an existing baseline in experiments.

best-action identification, stochastic minimax trees, fixed-confidence, Monte Carlo Tree Search (MCTS), multi-fidelity methods, bandit algorithms, heuristic evaluation, AI planning, computational cost, algorithmic correctness

Authors

Peter Chen, Xi Chen

Abstract

We study fixed-confidence best-action identification (BAI) in stochastic minimax trees. This problem is increasingly relevant in modern AI planning, where deep minimax search and Monte Carlo Tree Search (MCTS) with long language-model rollouts face a fundamental tradeoff: heuristic evaluations are cheap but biased, while accurate rollouts are reliable but prohibitively expensive. We propose 2FFS, a two-fidelity tree-search algorithm that brings multi-fidelity flat-bandit ideas into trees. The algorithm combines minimax-style fast expansion with MCTS-style stochastic sampling, adaptively deciding when to exploit cheap biased evaluations and when to invoke expensive accurate evaluations for local certification. We prove fixed-confidence correctness, establish finite stopping for exact identification, and give a polynomial-depth cost upper bound for general-depth trees. In numerical stochastic-tree experiments, 2FFS uses substantially fewer samples and computational operations than the existing BAI-MCTS baseline.
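The tradeoff between cheap biased evaluations and expensive accurate ones can be illustrated with a minimal sketch. It is not the 2FFS procedure itself, whose details the abstract does not give. It assumes rewards in [0, 1], a known bias bound for the cheap evaluator, Hoeffding confidence intervals without a union bound over leaves, and a stopping check on interval separation at the root. All function names and numeric values are hypothetical.

```python
import math
from typing import List, Optional, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound) on a node value


def hoeffding_radius(n: int, delta: float) -> float:
    """Half-width of a two-sided Hoeffding interval for n samples in [0, 1]."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def leaf_interval(mean: float, n: int, delta: float,
                  bias_bound: float = 0.0) -> Interval:
    """Interval for a leaf value. A cheap evaluator widens the interval by
    its assumed bias bound; an accurate evaluator uses bias_bound = 0."""
    r = hoeffding_radius(n, delta) + bias_bound
    return (mean - r, mean + r)


def backup(children: List[Interval], is_max: bool) -> Interval:
    """Propagate child intervals to a max node or a min node."""
    pick = max if is_max else min
    return (pick(lo for lo, _ in children), pick(hi for _, hi in children))


def certified_best(actions: List[Interval]) -> Optional[int]:
    """Index of the root action whose lower bound is at least every other
    action's upper bound, or None if no action is separated yet."""
    for i, (lo, _) in enumerate(actions):
        if all(lo >= hi for j, (_, hi) in enumerate(actions) if j != i):
            return i
    return None


# Hypothetical depth-2 tree: a max root over two actions, each a min node
# with two leaves. Empirical means and sample counts are made up.
N, DELTA, BIAS = 400, 0.05, 0.1

# Every leaf evaluated with the cheap, biased evaluator.
cheap = [
    backup([leaf_interval(0.8, N, DELTA, BIAS),
            leaf_interval(0.7, N, DELTA, BIAS)], is_max=False),
    backup([leaf_interval(0.5, N, DELTA, BIAS),
            leaf_interval(0.9, N, DELTA, BIAS)], is_max=False),
]

# The two leaves that determine each min node re-evaluated accurately.
mixed = [
    backup([leaf_interval(0.8, N, DELTA, BIAS),
            leaf_interval(0.7, N, DELTA)], is_max=False),
    backup([leaf_interval(0.5, N, DELTA),
            leaf_interval(0.9, N, DELTA, BIAS)], is_max=False),
]

print(certified_best(cheap))  # intervals overlap, so no action is certified
print(certified_best(mixed))  # accurate checks on two leaves separate action 0
```

In this toy instance the cheap intervals alone are too wide to separate the two root actions, while accurate evaluations at only two of the four leaves are enough to certify one. A full fixed-confidence guarantee would additionally require splitting the error budget across all leaves and sampling rounds.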
