Towards Faithful Multimodal Concept Bottleneck Models

2026-03-13

Computer Vision and Pattern Recognition · Machine Learning
AI summary

The authors study concept bottleneck models (CBMs), which route decisions through human-understandable concepts, in settings where inputs come from multiple sources such as images and text. They argue that for these models to be faithful, two conditions must hold: concepts must be detected accurately, and concept representations must not smuggle unrelated information into the final decision, a problem known as leakage. Their method, f-CBM, tackles both problems jointly, using a dedicated differentiable loss to reduce leakage and a more expressive prediction head to improve concept detection. Experiments show that f-CBM achieves the best balance of task accuracy, concept detection, and leakage reduction across different types of data.

Concept Bottleneck Models · Multimodal Learning · Concept Leakage · Concept Detection · Vision-Language Models · Differentiable Loss · Kolmogorov-Arnold Network · Interpretability · Predictive Accuracy
Authors
Pierre Moreau, Emeline Pineau Ferrand, Yann Choho, Benjamin Wong, Annabelle Blangero, Milan Bhan
Abstract
Concept Bottleneck Models (CBMs) are interpretable models that route predictions through a layer of human-interpretable concepts. While widely studied in vision and, more recently, in NLP, CBMs remain largely unexplored in multimodal settings. For their explanations to be faithful, CBMs must satisfy two conditions: concepts must be properly detected, and concept representations must encode only their intended semantics, without smuggling extraneous task-relevant or inter-concept information into final predictions, a phenomenon known as leakage. Existing approaches treat concept detection and leakage mitigation as separate problems, and typically improve one at the expense of predictive accuracy. In this work, we introduce f-CBM, a faithful multimodal CBM framework built on a vision-language backbone that jointly targets both aspects through two complementary strategies: a differentiable leakage loss to mitigate leakage, and a Kolmogorov-Arnold Network prediction head that provides sufficient expressiveness to improve concept detection. Experiments demonstrate that f-CBM achieves the best trade-off between task accuracy, concept detection, and leakage reduction, while applying seamlessly to both image-text and text-only datasets, making it versatile across modalities.
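To make the bottleneck idea concrete, here is a minimal sketch of how a CBM routes a prediction through concept activations. This is not the authors' implementation: the dimensions, the random linear maps, and the plain linear head standing in for the Kolmogorov-Arnold Network are all illustrative assumptions. The key structural point it shows is that the task logits depend on the input only through the concept vector.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy dimensions (hypothetical): backbone embedding -> k concepts -> n_classes.
d, k, n_classes = 16, 5, 3

# Stand-in parameters (random here; learned in practice).
W_concept = rng.normal(size=(d, k))       # concept scorer on top of the backbone
W_head = rng.normal(size=(k, n_classes))  # prediction head (linear stand-in for the KAN)

def cbm_forward(embedding):
    """Route the task prediction through the concept bottleneck."""
    concept_logits = embedding @ W_concept
    concepts = sigmoid(concept_logits)  # interpretable concept activations in [0, 1]
    logits = concepts @ W_head          # final prediction sees *only* the concepts
    return concepts, logits

embedding = rng.normal(size=(d,))  # placeholder for a vision-language embedding
concepts, logits = cbm_forward(embedding)
print(concepts.shape, logits.shape)  # (5,) (3,)
```

Leakage, in this picture, is the failure mode where the concept activations carry task-relevant signal beyond their stated semantics, so the head can exploit information the concept labels do not explain; f-CBM's leakage loss penalizes exactly that.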