Optimal Deterministic Multicalibration and Omniprediction

2026-06-18Machine Learning

Machine Learning
AI summary

The authors study a special way to check if machine learning models are fair and unbiased called multicalibration, which ensures predictions stay accurate across many groups. Previously, the best methods to achieve this needed randomness and lots of data, while deterministic (non-random) methods required much more data. The authors solve this problem by creating the first deterministic method that uses the least possible data to reach multicalibration. They also extend their approach to related fairness notions and solve other open questions about creating reliable prediction tools.

multicalibrationcalibrationdeterministic predictorsrandomized algorithmssample complexityoutcome indistinguishabilityminimax optimalitymachine learning fairnessomnipredictorspanpredictors
Authors
Georgy Noarov, Aaron Roth
Abstract
A model is multicalibrated on a collection of group weights $G$ if it is calibrated -- i.e. unbiased even conditional on its prediction -- not just overall, but also after reweighting contexts by each $g \in G$. It is a useful property for many downstream applications and is a basic desideratum of trustworthy machine learning. Before this work, all predictors known to attain the minimax-optimal $\widetilde O(\varepsilon^{-3})$ sample complexity rate for $\varepsilon$-multicalibration were randomized, while deterministic predictors were known only with substantially worse sample complexity. Whether randomization is necessary for optimal sample complexity in multicalibration was explicitly asked by [CLNR26] and implicitly in several prior works. We resolve this open problem by giving a minimax-optimal multicalibration algorithm that outputs a deterministic predictor. We then generalize the algorithm to produce optimal deterministic predictors that satisfy outcome indistinguishability (OI) with respect to finite or finitely covered collections of tests. As an application, this also gives deterministic omnipredictors and panpredictors with optimal sample complexity, resolving open problems posed by [OKK25] and [BHHLZ25].