Bridge the Gaps: Heterogeneous Attributed Graph Clustering via Quaternion Representation Learning

2026-06-22 • Machine Learning

AI summary

The authors address the problem of grouping nodes in graphs whose node attributes mix different types, such as numerical and categorical values, which makes it hard to learn a unified representation. They identify two main issues: nodes becoming too similar after repeated propagation (over-smoothing) and discriminative attribute details being overshadowed by the graph structure (over-dominating). To fix this, they propose AGREE, a method that aligns the different attribute types, strengthens attribute interaction with quaternion-based graph convolution, and uses shallow graph architectures to relieve over-smoothing. The approach learns node representations for clustering without needing the number of clusters in advance during training, and it shows strong overall accuracy, robustness, and adaptability on diverse benchmarks.

Attributed graph clustering, Graph topology, Attribute heterogeneity, Over-smoothing, Over-dominating, Quaternion graph convolution, Graph representation learning, Graph reconstruction, Shallow graph architectures, Multi-level alignment

Authors

Xinxi Chen, Junyang Chen, Yiqun Zhang, Chuangming Qiu, Xiang Zhang

Abstract

Attributed graph clustering partitions nodes by jointly exploiting node attributes and graph topology. It remains challenging due to attribute heterogeneity and representation degradation during graph learning. Real-world datasets often contain heterogeneous attributes, i.e., numerical and categorical attributes, complicating unified representation learning. This challenge becomes more complex in attributed graphs, where constructing a clustering-friendly graph structure from attributes and topology remains difficult. Under deep graph architectures, repeated graph propagation causes node embeddings to become overly similar, leading to the over-smoothing (OS) effect. Meanwhile, graph representation learning amplifies topological influence, making discriminative attribute information harder to exploit for clustering, an effect we refer to as over-dominating (OD). To bridge these gaps, an end-to-end framework, Any-type attributed Graph REpresentation lEarning (AGREE), is proposed. It unifies attributed graphs and any-type attributed data through multi-level alignment and similarity-based graph construction. Quaternion-based graph convolution strengthens attribute interaction to alleviate OD, while shallow graph architectures help relieve OS. The learned embeddings are jointly optimized for graph reconstruction and clustering, without requiring a predefined number of clusters during training. Experiments on diverse benchmarks show that AGREE achieves strong overall performance in accuracy, robustness, and adaptability.
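The abstract does not spell out how the quaternion-based graph convolution is defined, so the following is only a minimal NumPy sketch of the general idea, not AGREE's actual layer. It assumes each node feature is a quaternion stored as four real components (r, i, j, k), that weights multiply features from the left via the Hamilton product, and that a single propagation step uses the symmetrically normalized adjacency with self-loops (a common GCN convention, assumed here). All function names are hypothetical.

```python
import numpy as np


def hamilton_product(q, p):
    """Hamilton product of quaternion arrays whose last axis holds (r, i, j, k)."""
    a1, b1, c1, d1 = np.moveaxis(q, -1, 0)
    a2, b2, c2, d2 = np.moveaxis(p, -1, 0)
    return np.stack(
        [
            a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i part
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j part
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k part
        ],
        axis=-1,
    )


def quaternion_linear(X, W):
    """Quaternion-valued linear map.

    X: (n_nodes, d_in, 4) node features, W: (d_in, d_out, 4) weights.
    Returns (n_nodes, d_out, 4), summing W[i, o] * X[n, i] over i.
    """
    prod = hamilton_product(W[None, :, :, :], X[:, :, None, :])
    return prod.sum(axis=1)


def normalized_adjacency(A):
    """D^{-1/2} (A + I) D^{-1/2} for a symmetric adjacency matrix A (assumed convention)."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def quaternion_graph_conv(A_hat, X, W):
    """One shallow propagation step: transform features, then average over neighbors."""
    H = quaternion_linear(X, W)
    # The same real-valued A_hat is applied to each of the four components.
    return np.einsum("uv,vdc->udc", A_hat, H)


# Toy usage: a 4-node path graph, 3 input and 2 output quaternion channels.
rng = np.random.default_rng(0)
A = np.array(
    [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float
)
X = rng.normal(size=(4, 3, 4))
W = rng.normal(size=(3, 2, 4))
Z = quaternion_graph_conv(normalized_adjacency(A), X, W)
```

Because the Hamilton product mixes all four components of a feature with all four components of a weight, each output component depends on every input component. This cross-component mixing is one plausible reading of how a quaternion layer could strengthen attribute interaction; a single layer is used here in line with the shallow architectures mentioned in the abstract.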
