MINTS: Minimalist Thompson Sampling

2026-06-01 • Artificial Intelligence

Artificial Intelligence, Machine Learning

AI summary

The author proposes a Bayesian approach to sequential decision-making that places a prior only on where the best option is, instead of modeling every underlying parameter. This makes it easier to incorporate extra rules or structure in the problem. The approach is instantiated as MINimalist Thompson Sampling (MINTS), which is shown to achieve near-optimal regret in multi-armed bandit problems, including those with constraints on the arm means. It matches the known theoretical limit (the Lai-Robbins constant) when there is no extra structure and adapts automatically when the arm means are unimodal.

Bayesian methods, sequential decision-making, multi-armed bandit, Thompson Sampling, profile likelihood, structural constraints, regret bounds, Lai-Robbins constant, unimodal bandits

Authors

Kaizheng Wang

Abstract

The Bayesian paradigm offers principled tools for sequential decision-making under uncertainty, but its reliance on a probabilistic model for all parameters can hinder the incorporation of complex structural constraints. We introduce a minimalist Bayesian framework that places a prior only on the location of the optimum, while eliminating nuisance parameters through profile likelihood. This yields a generalized posterior that naturally accommodates structural constraints. As a direct instantiation, we develop MINimalist Thompson Sampling (MINTS). For multi-armed bandits with mean constraints, we establish near-optimal non-asymptotic regret guarantees and sharp almost-sure asymptotic regret characterizations. In particular, MINTS attains the classical Lai-Robbins constant in the unstructured setting and automatically adapts to unimodal structure, achieving the sharp constant determined only by the immediate neighbors of the optimal arm.
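
To make the idea in the abstract concrete, the following is a minimal sketch of one possible reading of it, not the paper's actual algorithm. It assumes unit-variance Gaussian rewards, a uniform prior over which arm is optimal, and no structural constraints beyond "arm j has the largest mean". The function names (`profile_loglik`, `optimum_posterior`, `run_bandit`) are hypothetical and chosen only for illustration. For each candidate arm j, the nuisance means are profiled out by maximizing the likelihood subject to arm j being best; the resulting generalized posterior over the optimum's location is then sampled to pick the next arm.

```python
import numpy as np

def profile_loglik(j, counts, means):
    """Profile log-likelihood (up to a constant) of the hypothesis 'arm j is optimal'.

    Assumes unit-variance Gaussian rewards. Maximizes the likelihood over all
    mean vectors mu with mu[j] >= mu[k] for every k, by pooling arm j with
    every arm whose empirical mean exceeds the pooled level.
    """
    order = [k for k in np.argsort(-means) if k != j]  # other arms, best first
    weight, level = counts[j], means[j]
    pooled = [j]
    for k in order:
        if means[k] <= level:
            break  # remaining arms already satisfy the constraint
        level = (weight * level + counts[k] * means[k]) / (weight + counts[k])
        weight += counts[k]
        pooled.append(k)
    pooled = np.array(pooled)
    return -0.5 * np.sum(counts[pooled] * (means[pooled] - level) ** 2)

def optimum_posterior(counts, means, prior=None):
    """Generalized posterior over which arm is optimal (assumed uniform prior by default)."""
    n_arms = len(counts)
    prior = np.full(n_arms, 1.0 / n_arms) if prior is None else np.asarray(prior, dtype=float)
    logp = np.log(prior) + np.array(
        [profile_loglik(j, counts, means) for j in range(n_arms)]
    )
    logp -= logp.max()  # stabilize before exponentiating
    p = np.exp(logp)
    return p / p.sum()

def run_bandit(true_means, horizon, seed=0):
    """Simulate the sketch on a Gaussian bandit; returns the pull count per arm."""
    rng = np.random.default_rng(seed)
    n_arms = len(true_means)
    counts = np.zeros(n_arms)
    sums = np.zeros(n_arms)
    for t in range(horizon):
        if t < n_arms:
            arm = t  # pull every arm once so empirical means are defined
        else:
            arm = rng.choice(n_arms, p=optimum_posterior(counts, sums / counts))
        reward = rng.normal(true_means[arm], 1.0)
        counts[arm] += 1
        sums[arm] += reward
    return counts
```

Calling `run_bandit([0.0, 0.5, 1.0], horizon=500)` returns how often each arm was pulled. In this sketch, structural constraints such as unimodality would enter only through the constrained maximization inside `profile_loglik`, which is the part the abstract says the framework is designed to accommodate.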
