Probing Routing-Conditional Calibration in Attention-Residual Transformers

2026-05-11 · Computer Vision and Pattern Recognition · Artificial Intelligence
AI summary

The authors investigate whether internal routing information from Attention-Residual transformers improves post-hoc calibration of model predictions beyond confidence scores alone. They develop matched tests comparing calibration errors with and without this routing data and find that routing information does not provide stable or reliable improvements. Even higher-capacity probes that use routing features do not outperform capacity-matched confidence-only controls once proper controls are applied. The authors conclude that apparent benefits from routing-aware calibration are likely due to confounding factors rather than genuine internal-state information.

Keywords
post-hoc calibration, logits, softmax confidence, Attention-Residual transformer, routing traces, expected calibration error (ECE), Nadaraya-Watson estimator, matched-confidence diagnostic, permutation test, machine learning model calibration
Authors
Wenhao Liang, Lin Yue, Wei Emma Zhang, Miao Xu, Mingyu Guo, Olaf Maennel, Weitong Chen
Abstract
Post-hoc calibration is usually evaluated as a function of logits or softmax confidence alone, even as routing-augmented architectures increasingly accompany predictions with sample-specific internal routing traces and pair them with claims of calibration-relevant uncertainty. We ask a basic question: do these traces provide stable routing-specific evidence for post-hoc calibration beyond confidence? We study this in Attention-Residual transformers (Kimi Team, 2026) through a matched-confidence diagnostic suite that stratifies examples by routing-derived state, compares subgroup gaps against within-bin routing-permutation nulls, and evaluates matched post-hoc probes differing only in their auxiliary feature. Across our completed AR runs, scalar routing summaries do not provide stable evidence of routing-conditional miscalibration: weighted gaps remain small or seed-sensitive, and only $1$ of $30$ within-bin permutation tests rejects the conditional null at $\alpha=0.05$ (only on one seed; not stable across seeds in that cell). AR-CondCal, a minimal $2$-D Nadaraya--Watson probe on confidence and routing-depth variance, lies within the seed-variance band of matched confidence-only and predictive-entropy controls and does not reliably improve worst-routing-tertile ECE; bandwidth-sensitivity checks (Scott multiples, CV-NLL, global-ECE oracle) do not change this. A full-vector MLP over $(c, H_1, \ldots, H_L)$ can appear to improve over a linear confidence baseline, but the apparent gain disappears once a capacity-matched confidence-only MLP is included as a control, and shuffled routing profiles achieve comparable performance. Apparent routing-aware calibration gains in this AR setting should not be read as internal-state calibration until matched-confidence, bandwidth, capacity, and permutation controls rule out common confounds.
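The core ingredients of the diagnostic suite described in the abstract — binned ECE, a within-bin permutation of routing-derived labels (the conditional null), and a 2-D Nadaraya-Watson probe on confidence and a routing summary — can be sketched minimally. This is an illustrative reconstruction, not the authors' implementation: the bin count, the Gaussian kernel, the bandwidths, and the function names (`ece`, `within_bin_permute`, `nw_probe`) are all assumptions made here for exposition.

```python
import numpy as np

def ece(conf, correct, n_bins=15):
    """Expected calibration error over equal-width confidence bins:
    weighted mean |avg confidence - accuracy| per bin (illustrative choice)."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(conf, edges) - 1, 0, n_bins - 1)
    err = 0.0
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            err += mask.mean() * abs(conf[mask].mean() - correct[mask].mean())
    return err

def within_bin_permute(conf, route_group, rng, n_bins=15):
    """Conditional null: shuffle routing-derived group labels only among
    examples that share a confidence bin, so confidence-group dependence
    is preserved while any routing-specific signal is destroyed."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(conf, edges) - 1, 0, n_bins - 1)
    perm = route_group.copy()
    for b in range(n_bins):
        mask = idx == b
        perm[mask] = rng.permutation(perm[mask])
    return perm

def nw_probe(conf, route_var, correct, q_conf, q_rv, bw=(0.1, 0.1)):
    """2-D Nadaraya-Watson regression of correctness on
    (confidence, routing-depth variance) with a product Gaussian kernel;
    fixed bandwidths here stand in for Scott-rule / CV-selected ones."""
    d = ((q_conf[:, None] - conf[None, :]) / bw[0]) ** 2 \
      + ((q_rv[:, None] - route_var[None, :]) / bw[1]) ** 2
    w = np.exp(-0.5 * d)
    return w @ correct / w.sum(axis=1)
```

A matched comparison in this spirit would evaluate `ece` on the probe's recalibrated scores against a confidence-only control, and compare observed subgroup gaps to the distribution obtained by repeatedly applying `within_bin_permute`.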