FedLAB: Traceable Semantic Codebooks for Federated Multimodal Graph Foundation Learning
2026-06-30 • Machine Learning
AI summary
The authors study how to teach computers to understand complex graphs that combine text, images, and connections when the data is spread across many sites that cannot share raw information for privacy reasons. They introduce FedLAB, a method that organizes knowledge about the different data types, the nodes, and their connections into codebooks that record which evidence supports each prediction. The method learns by combining information across sites without sharing the original data, improving performance on a range of tasks while letting people trace the reasoning behind predictions. In their experiments on 10 benchmarks and 6 downstream tasks, FedLAB outperforms other leading methods by up to 7.53%.
multimodal graphs, federated learning, semantic codebook, node semantics, topology context, representation learning, privacy constraints, semantic traceability, federated pre-training, graph foundation models
Authors
Zekai Chen, Kairui Yang, Xuaner Chen, Xunkai Li, Xun Wu, Rong-Hua Li, Guoren Wang
Abstract
Multimodal graph foundation models aim to learn reusable knowledge from graphs enriched with text, images, attributes, and relational topology, thereby supporting diverse graph-centric and modality-centric tasks. In practice, however, such multimodal graphs are often distributed across decentralized clients, where raw contents and local structures cannot be centrally shared due to privacy constraints. This motivates federated multimodal graph foundation learning, which requires not only transferable representation learning but also intrinsic semantic traceability under strict data isolation. Existing methods usually exchange or store knowledge through parameters, prototypes, embeddings, or compact codebooks, which support optimization and transfer but do not explicitly expose how modality evidence, node semantics, and topology context jointly support predictions. To bridge this gap, we propose FedLAB, a traceable semantic codebook framework that organizes multimodal graph knowledge into typed hierarchical codebooks for modality evidence, node semantics, and topology context. FedLAB further refines these trace units through federated semantic barycenter pre-training while keeping raw multimodal contents and graph structures local. Extensive experiments on 10 benchmarks and 6 downstream tasks show that FedLAB improves over state-of-the-art baselines by up to 7.53%, while preserving a native semantic trace interface.
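To make the idea of typed codebooks and server-side aggregation more concrete, the following is a minimal sketch, not the authors' implementation. It assumes one flat codebook per type (the abstract describes hierarchical ones), that code indices are already aligned across clients, and it substitutes a plain weighted Euclidean mean for the semantic barycenter, whose exact definition the abstract does not give. All names (`CODEBOOK_TYPES`, `nearest_code`, `trace`, `weighted_barycenter`, `aggregate`) are hypothetical.

```python
from math import dist

# Hypothetical type names, mirroring the three kinds of knowledge in the abstract.
CODEBOOK_TYPES = ("modality", "node", "topology")


def nearest_code(vector, codebook):
    """Return the index of the code vector closest to `vector` (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda i: dist(vector, codebook[i]))


def trace(features, codebooks):
    """Map each typed feature vector to the index of its nearest code.

    The resulting dict is a toy stand-in for a semantic trace: it records
    which code of each type a prediction would be attributed to.
    """
    return {t: nearest_code(features[t], codebooks[t]) for t in CODEBOOK_TYPES}


def weighted_barycenter(client_codebooks, weights):
    """Average aligned client codebooks code by code.

    Assumption: a weighted Euclidean mean, used here only as a simple
    stand-in for the paper's semantic barycenter.
    """
    total = sum(weights)
    n_codes = len(client_codebooks[0])
    dim = len(client_codebooks[0][0])
    return [
        [
            sum(w * cb[k][d] for cb, w in zip(client_codebooks, weights)) / total
            for d in range(dim)
        ]
        for k in range(n_codes)
    ]


def aggregate(clients, weights):
    """Build a global codebook per type from client codebooks only.

    No raw features or graph structure are passed in, only code vectors.
    """
    return {
        t: weighted_barycenter([c[t] for c in clients], weights)
        for t in CODEBOOK_TYPES
    }
```

In this sketch a server would call `aggregate` on the codebooks uploaded by clients, weighted for example by local sample counts, and each client would then call `trace` on its own features against the returned global codebooks, so raw contents never leave the client.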