Beyond Structural Symmetries: Linear Mode Connectivity via Neuron Identifiability
2026-06-03 • Machine Learning
AI summary
The authors study parameter symmetries: transformations of a neural network's parameters that leave the function it computes unchanged. They develop a framework describing the set of functions a single neuron can realize on its input support and the norm cost of realizing them. Their analysis shows that even structurally asymmetric models can admit large families of approximately equivalent solutions. They also show that when neurons can be identified across independent training runs, representations from different networks can be merged without prior alignment, and they characterize when such merging admits a linear low-loss path. Overall, the work links effective function classes to the structure of the loss landscape.
parameter symmetries, neural networks, mode connectivity, training dynamics, function classes, neuron identifiability, loss landscape, representation merging
Authors
Vincent Bürgin, Daniel Herbst, Ya-Wei Eileen Lin, Stefanie Jegelka
Abstract
Many striking phenomena in deep learning, such as linear mode connectivity and the structured behavior of training dynamics, are closely tied to parameter symmetries: transformations that leave the realized function unchanged. Despite growing attention to parameter symmetries, the exact interplay between parameters, data, and representations remains underexplored. To investigate this, we develop a theoretical framework of effective function classes, i.e., the set of functions a neuron can realize on its input support, and the norm cost of realizing them. We then formalize effective symmetry breaking via neuron identifiability across independent training runs. Our analysis shows that neural networks can admit large families of approximately equivalent solutions even in structurally asymmetric models. We further show that neuron identifiability enables representation merging without prior alignment, and characterize when such merging admits a linear low-loss path. These findings highlight the role of effective function classes in shaping the loss landscape.
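As a rough illustration of the best-known structural parameter symmetry mentioned above, the following sketch permutes the hidden neurons of a small two-layer tanh network and checks that the realized function is unchanged. It is not the paper's framework: the network, the helper names (`mlp`, `permute_hidden`, `interpolate`), and the random weights are all assumptions made for this example.

```python
import math
import random


def mlp(x, W1, b1, w2):
    """Two-layer tanh network: sum_j w2[j] * tanh(W1[j] . x + b1[j])."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    return sum(v * h for v, h in zip(w2, hidden))


def permute_hidden(W1, b1, w2, perm):
    """Reorder hidden neurons; incoming and outgoing weights move together."""
    return [W1[j] for j in perm], [b1[j] for j in perm], [w2[j] for j in perm]


def interpolate(params_a, params_b, t):
    """Naive parameter-space interpolation (1 - t) * a + t * b, no alignment."""
    W1a, b1a, w2a = params_a
    W1b, b1b, w2b = params_b

    def mix(u, v):
        return (1 - t) * u + t * v

    W1 = [[mix(u, v) for u, v in zip(ra, rb)] for ra, rb in zip(W1a, W1b)]
    b1 = [mix(u, v) for u, v in zip(b1a, b1b)]
    w2 = [mix(u, v) for u, v in zip(w2a, w2b)]
    return W1, b1, w2


# Hypothetical random weights standing in for a trained network.
rng = random.Random(0)
d_in, d_hidden = 3, 4
W1 = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_hidden)]
b1 = [rng.gauss(0, 1) for _ in range(d_hidden)]
w2 = [rng.gauss(0, 1) for _ in range(d_hidden)]
x = [rng.gauss(0, 1) for _ in range(d_in)]

perm = [2, 0, 3, 1]
params_a = (W1, b1, w2)
params_b = permute_hidden(W1, b1, w2, perm)

out_a = mlp(x, *params_a)  # original parameters
out_b = mlp(x, *params_b)  # permuted parameters, same function
out_mid = mlp(x, *interpolate(params_a, params_b, 0.5))  # unaligned midpoint

print(out_a, out_b, out_mid)
```

The two parameter vectors differ but give the same output up to floating-point rounding, while the unaligned midpoint generally realizes a different function. This is the usual reason networks are aligned before merging, and it is the step that, according to the abstract, neuron identifiability can make unnecessary.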