SOAP-Bubbles: Structured Weight Uncertainty for Neural Networks

2026-06-22 · Machine Learning

AI summary

The authors propose a way to estimate structured (non-diagonal) uncertainty over neural network weights at a cost similar to that of the SOAP optimizer and without drastic changes to training pipelines. They run IVON, an existing diagonal-covariance variational method, in the eigenspace of SOAP's preconditioner, then use the preconditioner to turn the diagonal estimate into a non-diagonal covariance. The resulting optimizer is called Eigenspace-VON (EVON), and the posteriors it produces are called SOAP-Bubbles. For logistic regression, EVON recovers the exact Gaussian covariance; for language model pretraining, it yields significantly better results than existing diagonal-covariance methods. The work makes more expressive posterior estimates practical for deep learning at scale.

structured weight uncertainty, SOAP optimizer, IVON, diagonal covariance, non-diagonal covariance, variational inference, preconditioner, logistic regression, language model pretraining, deep learning optimization
Authors
Adrian Robert Minut, Nico Daheim, Marco Miani, Mohammad Emtiyaz Khan, Wu Lin, Thomas Möllenhoff
Abstract
Structured weight-uncertainty can improve many aspects of deep learning, but it remains costly to estimate and difficult to implement. Here, we show that these issues can be addressed by adapting the SOAP optimizer. Our key idea is to run IVON, an existing diagonal-covariance variational method, in the eigenspace of SOAP's preconditioner and then use the preconditioner to transform the diagonal estimate into a non-diagonal covariance. The resulting method has costs similar to those of SOAP and requires no drastic changes to training pipelines. We call the posteriors obtained in this way SOAP-Bubbles and our new optimizer Eigenspace-VON (EVON). We show that, for logistic regression, EVON recovers the exact Gaussian covariance and that, for language model pretraining, it yields significantly better results than existing diagonal-covariance methods. Our work makes it easier to estimate more expressive posterior distributions for deep learning at scale.
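
To make the key idea concrete, the following is a minimal NumPy sketch of how a diagonal variance held in a preconditioner's eigenspace implies a non-diagonal covariance over the original weights. It is an illustration under stated assumptions, not the authors' EVON implementation: it assumes a single weight matrix, assumes the preconditioner is built from the two gradient statistics G Gᵀ and Gᵀ G, uses random stand-in gradients and a random stand-in variance, and omits the IVON update itself. All function and variable names are hypothetical.

```python
import numpy as np

def eigenbasis(stat):
    """Orthonormal eigenvectors of a symmetric PSD statistic matrix."""
    _, vecs = np.linalg.eigh(stat)
    return vecs

def sample_weights(mean, q_left, q_right, var_rot, rng):
    """Draw W = mean + Q_L (sqrt(var_rot) * eps) Q_R^T with eps ~ N(0, I)."""
    eps = rng.standard_normal(mean.shape)
    return mean + q_left @ (np.sqrt(var_rot) * eps) @ q_right.T

def dense_covariance(q_left, q_right, var_rot):
    """Covariance of column-major vec(W) implied by the rotated diagonal variance."""
    basis = np.kron(q_right, q_left)       # vec(A X B^T) = (B kron A) vec(X)
    diag = var_rot.flatten(order="F")      # column-major vec matches the kron ordering
    return (basis * diag) @ basis.T        # basis @ diag(diag) @ basis.T

rng = np.random.default_rng(0)
m, n = 3, 2

# Stand-in gradients (assumption): averaged G G^T and G^T G play the role
# of the left and right preconditioner statistics.
grads = rng.standard_normal((16, m, n))
left_stat = np.mean(grads @ grads.transpose(0, 2, 1), axis=0)    # (m, m)
right_stat = np.mean(grads.transpose(0, 2, 1) @ grads, axis=0)   # (n, n)
q_left, q_right = eigenbasis(left_stat), eigenbasis(right_stat)

# Stand-in for the diagonal variance a diagonal method would estimate
# in the rotated coordinates (one entry per weight).
var_rot = rng.uniform(0.1, 1.0, size=(m, n))
mean = np.zeros((m, n))

cov = dense_covariance(q_left, q_right, var_rot)   # (m*n, m*n), generally non-diagonal
w = sample_weights(mean, q_left, q_right, var_rot, rng)
```

In this sketch, sampling costs only two small matrix products per weight matrix, and the dense covariance is formed purely for inspection. Its eigenvalues are exactly the entries of `var_rot`, which shows that the rotation changes the orientation of the Gaussian and not its per-direction variances.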