A Doeblin-Anchored Contrastive Chart for Learning Markov Transition Kernels

2026-06-01 · Machine Learning

AI summary

The authors present a method for learning Markov transition models whose output is a valid transition kernel, which matters because the learned kernel is iterated in downstream dynamics. They introduce a Doeblin-anchored contrastive chart, which mixes the target transition with a restart law to obtain a Doeblin-minorized and explicitly invertible representation of the transition. Because inverting a learned score can produce a signed or unnormalized object, they add a Markovization operator that restores kernel validity while preserving integrated $L^1$ accuracy up to a constant factor, and they derive oracle inequalities and nonparametric rates. They also extend the results to stationary geometrically $\beta$-mixing trajectories and bound how one-step kernel error propagates to finite-horizon marginal, path-law, and occupation-measure errors.

Markov transition kernel, Doeblin minorization, contrastive learning, conditional density estimation, invertible coordinates, score inversion, Markovization operator, beta-mixing, oracle inequalities, Hölder–ReLU approximation
Authors
Ao Xu
Abstract
Learning a Markov transition model is not merely conditional density estimation: the learned object must be a valid transition kernel before it is iterated in downstream dynamics. This paper introduces a Doeblin-anchored contrastive chart, a statistical-to-dynamical coordinate framework for learning transition kernels from contrastive objectives. Given a restart law and an anchor strength, the chart mixes the target transition with the restart law. The resulting anchored kernel is simultaneously a Doeblin-minorized Markov kernel, the positive conditional law in a binary contrastive experiment, and an explicitly invertible coordinate for the original transition law. We prove that the anchored contrastive risk identifies the anchored transition density and calibrates excess risk to density error. Since inversion of a learned score may produce a signed or unnormalized object, we introduce a measurable Markovization operator that restores kernel validity while preserving integrated $L^1$ accuracy up to a constant factor. Oracle inequalities and Hölder–ReLU approximation bounds yield nonparametric rates for independent transition pairs. For stationary geometrically $\beta$-mixing trajectories, a conservative thinning-and-coupling extension yields the same reconstruction interface with an effective sample size. Occupancy-weighted perturbation bounds transfer one-step kernel error to finite-horizon marginal, path-law, and occupation-measure errors under explicit coverage.
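
To make the mixing, inversion, and repair steps concrete, the following is a minimal finite-state sketch, not the paper's construction. It assumes the anchored kernel has the convex-mixture form $K_\alpha = (1-\alpha)P + \alpha\nu$, where $P$ is the target transition matrix, $\nu$ the restart law, and $\alpha \in (0,1)$ the anchor strength. It also assumes a balanced contrastive experiment with negatives drawn from $\nu$, so that the optimal logit is $\log(K_\alpha/\nu)$. The function names and the clip-and-renormalize repair are illustrative stand-ins; the Markovization operator in the paper may be defined differently.

```python
import numpy as np

def anchor(P, nu, alpha):
    """Assumed chart: mix each row of P with the restart law nu."""
    return (1.0 - alpha) * P + alpha * nu[None, :]

def kernel_from_score(score, nu):
    """Assumed score inversion for a balanced experiment: K = nu * exp(logit)."""
    return nu[None, :] * np.exp(score)

def invert_anchor(K, nu, alpha):
    """Undo the mixing. If K is only an estimate, rows may be signed or unnormalized."""
    return (K - alpha * nu[None, :]) / (1.0 - alpha)

def markovize(Q, nu):
    """Illustrative repair: clip negatives, renormalize rows, fall back to nu on empty rows."""
    Q_pos = np.clip(Q, 0.0, None)
    mass = Q_pos.sum(axis=1, keepdims=True)
    safe_mass = np.where(mass > 0, mass, 1.0)  # avoid division by zero
    return np.where(mass > 0, Q_pos / safe_mass, nu[None, :])

rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(4), size=4)   # toy 4-state transition matrix
nu = np.full(4, 0.25)                   # uniform restart law
alpha = 0.2                             # anchor strength

K = anchor(P, nu, alpha)                # every entry is at least alpha * nu (minorization)
score = np.log(K / nu[None, :])         # exact logit under the balanced-experiment assumption
K_from_score = kernel_from_score(score, nu)

# Simulate an imperfect learned score, then invert and repair.
noisy_score = score + 0.3 * rng.standard_normal(score.shape)
Q = invert_anchor(kernel_from_score(noisy_score, nu), nu, alpha)
P_hat = markovize(Q, nu)

err_raw = np.abs(Q - P).sum(axis=1)       # row-wise L1 error before repair
err_fixed = np.abs(P_hat - P).sum(axis=1) # row-wise L1 error after repair
```

For this particular repair, each row satisfies `err_fixed <= 2 * err_raw`: clipping never moves a row further from a nonnegative target, and renormalizing shifts it by at most the clipped row's mass error. This is one elementary way to obtain an $L^1$ guarantee up to a constant factor, offered here only as an illustration of the kind of statement the abstract describes.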