Improved Predictive Performance and Interpretability for Mesomorphic Neural Networks Using Local Fidelity Regularization
2026-06-29 • Machine Learning
AI summary
The authors study a special type of neural network called Interpretable Mesomorphic Neural Networks (IMNs), which aim to be both accurate and understandable. They found that the explanations these networks give can be unreliable: the network can concentrate all of its explanatory variance in a single weight of its linear output layer, which makes the interpretation meaningless. To fix this, they created a new technique called Local Fidelity Regularization (LFR) that keeps the explanations trustworthy by aligning the model's linear output weights with local variations in the data. Their tests show that this method improves explanation quality without hurting prediction accuracy, which even improves slightly.
Interpretable Neural Networks, Mesomorphic Networks, Local Fidelity Regularization, L1 Penalty, Model Interpretability, AUROC, Black-box Models, Predictive Performance, Explainability
Authors
Hugo L. Hammer, Vajira Thambawita, Kristoffer Herland Hellton, Pål Halvorsen
Abstract
Interpretable Mesomorphic Neural Networks (IMNs) offer a promising framework that combines the predictive power of deep neural networks with the interpretability of linear models. However, the original formulation lacks safeguards to ensure that the learned interpretations are in fact reliable. In particular, the network is free to concentrate all explanatory variance into a single weight of the linear output layer, achieving strong predictive performance while producing interpretations that are largely meaningless. Paradoxically, the L1 penalty proposed to encourage sparse solutions exacerbates this problem by further incentivizing such degenerate configurations. To address this vulnerability, we introduce Local Fidelity Regularization (LFR), a novel penalty term that prevents degenerate weight collapse by aligning the linear output weights with local data variations. This structural constraint guarantees faithful explanations and substantially improves the reliability of model interpretations. Furthermore, empirical evaluations across the OpenML benchmark suite demonstrate that LFR does not compromise accuracy for explainability; rather, it achieves a higher AUROC than the unregularized IMN. By yielding results highly competitive with state-of-the-art black-box models, LFR provides the dual benefit of reliable interpretability and superior predictive performance. Source code and usage instructions are available at https://github.com/hugohammer/LFR-IMN.git.
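The degeneracy described in the abstract can be illustrated with a small numerical sketch. It is a toy construction, not the authors' code, and rests on several assumptions that the abstract does not state: that the per-sample linear model has an input-dependent intercept, that the L1 penalty is applied to the feature weights only, and that the "single weight" absorbing the prediction is that intercept. The `fidelity_gap` function is a hypothetical stand-in for the idea of aligning weights with local data variations; it is not the paper's LFR formula.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))            # 5 samples, 3 features
true_w = np.array([2.0, -3.0, 1.0])    # toy ground-truth linear effect
logits = X @ true_w                    # target output for each sample

# Faithful explanation: per-sample weights equal the true effects, no intercept.
W_faithful = np.tile(true_w, (len(X), 1))
b_faithful = np.zeros(len(X))

# Degenerate explanation (assumed form): all feature weights are zero and a
# per-sample intercept carries the whole prediction.
W_degenerate = np.zeros_like(X)
b_degenerate = logits.copy()

def predict(W, b, X):
    """Per-sample linear model: sum_j W[i, j] * X[i, j] + b[i]."""
    return np.sum(W * X, axis=1) + b

def l1_penalty(W):
    """Mean L1 norm of the per-sample feature weights."""
    return np.abs(W).sum(axis=1).mean()

def fidelity_gap(W, local_slope):
    """Hypothetical measure: mean distance between the reported weights and
    the local slope of the model output with respect to the inputs."""
    return np.linalg.norm(W - local_slope, axis=1).mean()

# Both explanations describe the same end-to-end function x -> x @ true_w,
# so its local slope is true_w at every sample.
local_slope = np.tile(true_w, (len(X), 1))

same_predictions = np.allclose(predict(W_faithful, b_faithful, X),
                               predict(W_degenerate, b_degenerate, X))
print("identical predictions:", same_predictions)
print("L1 penalty   faithful / degenerate:",
      l1_penalty(W_faithful), l1_penalty(W_degenerate))
print("fidelity gap faithful / degenerate:",
      fidelity_gap(W_faithful, local_slope),
      fidelity_gap(W_degenerate, local_slope))
```

In this toy setting both explanations give identical predictions, yet the L1 penalty is lower for the degenerate one, so a loss built from prediction error plus L1 would prefer it. A term that compares the reported weights with the local behaviour of the output separates the two cases, which is the kind of constraint the abstract attributes to LFR.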