Variational Learning for Insertion-based Generation
2026-06-01 • Machine Learning
Machine Learning, Artificial Intelligence
AI summary
The authors study sequence generation that does not proceed strictly left to right but instead chooses where and when to insert new tokens, in any order. They propose the Insertion Process (IP), a model that learns which insertion orders to prefer and when to stop generating, rather than working on a fixed-length canvas. The approach writes the likelihood of a sequence as a sum over all possible insertion orders, which lets the model fit data with no canonical order, such as plans or molecular strings. Experiments on goal-conditioned planning and molecular string generation indicate that learning the insertion order improves both modeling quality and generalization.
non-monotonic sequence generation, masked diffusion models, autoregressive modeling, variable-length generation, insertion order, permutation, generative model, variational inference, goal-conditioned planning, molecular string generation
Authors
Yangtian Zhang, Zhe Wang, Arthur Gretton, Rex Ying, David van Dijk, Michalis K. Titsias, Jiaxin Shi
Abstract
Non-monotonic sequence generation methods, such as masked diffusion models, provide a flexible alternative to left-to-right autoregressive modeling by allowing tokens to be generated in orders that are neither fixed nor prescribed. Despite their practical advantages, most existing non-monotonic models are order-agnostic and rely on a fixed-length grid, limiting their ability to support variable-length generation and adaptive insertion order. In this work, we introduce a probabilistic framework for learning insertion order in variable-length insertion models. We formalize a bijective correspondence between insertion trajectories and permutations, which enables an exact reparameterization of the data likelihood as a sum over permutations. Building on this result, we propose the Insertion Process (IP), a stochastic generative model that jointly learns where to insert, what to insert, and when to terminate, trained via permutation-based variational inference. Unlike prior fixed-canvas approaches, IP natively supports variable-length generation and learns data-driven preferences over insertion orders. Experiments on goal-conditioned planning and molecular string generation demonstrate that learning insertion order improves both modeling quality and generalization in domains without a canonical left-to-right structure.
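The correspondence between insertion trajectories and permutations mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`permutation_to_trajectory`, `trajectory_to_permutation`, `replay`) and the slot convention (a slot is the index in the current partial sequence where the new token lands) are assumptions made for illustration only. Under that convention, each permutation of the final positions maps to exactly one sequence of insertion slots, and the permutation can be recovered from the slots alone.

```python
from itertools import permutations


def permutation_to_trajectory(tokens, sigma):
    """Turn a generation order into (slot, token) insertion steps.

    sigma[t] is the final position of the token inserted at step t.
    The slot is the index in the current partial sequence where it lands.
    (Hypothetical helper; the slot convention is an assumption.)
    """
    placed = []  # final positions already generated, kept sorted
    steps = []
    for pos in sigma:
        # Number of already-placed tokens that end up to the left of pos.
        slot = sum(1 for p in placed if p < pos)
        steps.append((slot, tokens[pos]))
        placed.insert(slot, pos)
    return steps


def trajectory_to_permutation(steps):
    """Recover sigma from the slots alone by replaying the insertions."""
    order = []  # order[i] = step at which final position i was filled
    for t, (slot, _token) in enumerate(steps):
        order.insert(slot, t)
    sigma = [0] * len(steps)
    for pos, t in enumerate(order):
        sigma[t] = pos
    return tuple(sigma)


def replay(steps):
    """Apply the insertion steps to an empty sequence."""
    seq = []
    for slot, token in steps:
        seq.insert(slot, token)
    return seq


if __name__ == "__main__":
    tokens = ["C", "=", "O"]
    for sigma in permutations(range(len(tokens))):
        steps = permutation_to_trajectory(tokens, sigma)
        print(sigma, steps, replay(steps))
```

Every one of the 3! = 6 orders rebuilds the same final sequence through a different slot sequence, which is why the likelihood of a sequence can be written as a sum over permutations. A model that learns insertion order would then place unequal probability on these trajectories instead of treating them as interchangeable.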