In-Context Graphical Inference

2026-06-03 | Machine Learning

Machine Learning, Computation and Language, Symbolic Computation
AI summary

The authors address a common trade-off in graphical models: exact inference methods are too slow on complex graphs, while fast approximate methods do not always converge. They propose In-Context Graphical Inference (ICG-I), which imitates the step-by-step Variable Elimination procedure using an autoregressive Graph Transformer. The method compresses the intermediate factors with Tensor-Train compression and adds a Dirichlet output layer and Weighted Conformal Prediction so that its uncertainty estimates remain reliable when the graph structure shifts. In their experiments, ICG-I lowers MAE from 0.041 (best baseline) to 0.020 on standard instances and reaches 0.048 on N=500 frustrated spin glasses, where Belief Propagation diverges.

Graphical models, Marginal inference, Variable elimination, Graph Transformer, Tensor-Train compression, Dirichlet-Multinomial loss, Weighted Conformal Prediction, Topological shift, Belief Propagation, Spin glasses
Authors
Zehua Cheng, Wei Dai, Jiahao Sun
Abstract
Marginal inference in discrete graphical models forces a choice between exactness and scalability: exact algorithms are intractable for high-treewidth graphs, while iterative approximations (Belief Propagation, variational methods) sacrifice convergence guarantees on frustrated topologies. We argue that this dichotomy stems from a mismatched inductive bias: iterative methods abandon the sequential elimination structure that makes exact inference correct. We introduce In-Context Graphical Inference (ICG-I), an autoregressive Graph Transformer that restores this structure by mimicking Variable Elimination with learned, Tensor-Train-compressed intermediate factors, paired with a Dirichlet output layer and Weighted Conformal Prediction (WCP) for calibrated, distribution-free coverage guarantees under topological shift. We prove that Tensor-Train (TT) compression errors propagate at most linearly through the autoregressive chain, that the Dirichlet-Multinomial loss is a proper scoring rule, and that WCP maintains coverage with a quantifiable degradation under estimated density ratios. In extensive experiments, ICG-I achieves state-of-the-art performance across all benchmarks: it reduces MAE from 0.041 (best baseline) to 0.020 on standard instances and achieves 0.048 on N=500 frustrated spin glasses where BP diverges entirely.
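For readers unfamiliar with the sequential elimination structure the abstract refers to, the following is a minimal sketch of classical, exact Variable Elimination on a tiny frustrated spin model. It is not the authors' ICG-I model: there is no Transformer, no learning, and no Tensor-Train compression, and the intermediate factors are kept exact. The function names (`eliminate`, `ve_marginal`, `brute_force_marginal`), the 4-node graph, the couplings, and the elimination order are all illustrative assumptions.

```python
import itertools
import numpy as np

# A factor is a pair (scope, table): `scope` is a tuple of variable indices,
# and `table` has one axis per variable in `scope`, in the same order.

def brute_force_marginal(factors, n_vars, k, target):
    """Reference marginal: sum the unnormalised joint over all k**n_vars states."""
    marg = np.zeros(k)
    for states in itertools.product(range(k), repeat=n_vars):
        weight = 1.0
        for scope, table in factors:
            weight *= table[tuple(states[v] for v in scope)]
        marg[states[target]] += weight
    return marg / marg.sum()

def eliminate(factors, var):
    """Multiply every factor that mentions `var`, then sum `var` out."""
    touching = [f for f in factors if var in f[0]]
    rest = [f for f in factors if var not in f[0]]
    out_scope = sorted({v for scope, _ in touching for v in scope} - {var})
    # einsum in "sublist" form: variable indices serve directly as axis labels.
    operands = []
    for scope, table in touching:
        operands += [table, list(scope)]
    new_table = np.einsum(*operands, out_scope)
    return rest + [(tuple(out_scope), new_table)]

def ve_marginal(factors, order, target):
    """Eliminate the variables in `order` (every variable except `target`)."""
    for var in order:
        factors = eliminate(factors, var)
    marg = np.ones_like(factors[0][1])
    for scope, table in factors:
        assert scope == (target,)  # only factors over the target remain
        marg = marg * table
    return marg / marg.sum()

# Toy instance (assumption): 4 binary spins, random +/-1 couplings, small fields.
rng = np.random.default_rng(0)
spins = np.array([-1.0, 1.0])
edges = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]
factors = []
for i, j in edges:
    J = rng.choice([-1.0, 1.0])
    factors.append(((i, j), np.exp(J * np.outer(spins, spins))))
for i in range(4):
    h = rng.normal(scale=0.5)
    factors.append(((i,), np.exp(h * spins)))

exact = brute_force_marginal(factors, n_vars=4, k=2, target=0)
ve = ve_marginal(factors, order=[3, 2, 1], target=0)
print("brute force:", exact)
print("variable elimination:", ve)
```

Each call to `eliminate` produces an intermediate factor whose size grows with the number of neighbouring variables, which is why exact elimination becomes intractable on high-treewidth graphs. According to the abstract, ICG-I keeps this step-by-step structure but replaces the exact intermediate factors with learned, Tensor-Train-compressed ones.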