Beyond Dense Connectivity: Explicit Sparsity for Scalable Recommendation

2026-04-09 · Information Retrieval

AI summary

The authors find that in dense recommendation models most learned connection weights shrink toward zero, so much of the network's capacity is spent on low-utility connections. They propose SSR, which builds sparsity into the architecture: the input is split into parallel views, each view filters out less useful dimensions, and the filtered views are then fused densely. Two filtering strategies are used: fixed random dimension subsets, and a dynamic, biologically inspired competition that keeps high-response dimensions. On three public datasets and a billion-scale industrial dataset, SSR outperforms the baselines under similar budgets and keeps improving with scale, whereas dense models saturate.

recommender systems, click-through rate (CTR), sparsity, multi-layer perceptron (MLP), dimension-level filtering, structural sparsity, dynamic sparsity, bio-inspired competition, scalability, industrial datasets
Authors
Yantao Yu, Sen Qiao, Lei Shen, Bing Wang, Xiaoyi Zeng
Abstract
Recent progress in scaling large models has motivated recommender systems to increase model depth and capacity to better leverage massive behavioral data. However, recommendation inputs are high-dimensional and extremely sparse, and simply scaling dense backbones (e.g., deep MLPs) often yields diminishing returns or even performance degradation. Our analysis of industrial CTR models reveals a phenomenon of implicit connection sparsity: most learned connection weights tend towards zero, while only a small fraction remain prominent. This indicates a structural mismatch between dense connectivity and sparse recommendation data; by compelling the model to process vast numbers of low-utility connections instead of valid signals, the dense architecture itself becomes the primary bottleneck to effective pattern modeling. We propose SSR (Explicit Sparsity for Scalable Recommendation), a framework that incorporates sparsity explicitly into the architecture. SSR employs a multi-view "filter-then-fuse" mechanism, decomposing inputs into parallel views for dimension-level sparse filtering followed by dense fusion. Specifically, we realize the sparsity via two strategies: a Static Random Filter that achieves efficient structural sparsity via fixed dimension subsets, and Iterative Competitive Sparse (ICS), a differentiable dynamic mechanism that employs bio-inspired competition to adaptively retain high-response dimensions. Experiments on three public datasets and a billion-scale industrial dataset from AliExpress (a global e-commerce platform) show that SSR outperforms state-of-the-art baselines under similar budgets. Crucially, SSR exhibits superior scalability, delivering continuous performance gains where dense models saturate.
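
To make the "filter-then-fuse" idea concrete, the following is a minimal sketch of the Static Random Filter variant only, written from the abstract's description rather than from the authors' code. The function names (`make_static_filters`, `filter_then_fuse`), the view count, the subset size, and the single linear fusion layer are all illustrative assumptions; the paper's actual layer shapes, fusion network, and the ICS mechanism are not specified here and are not reproduced.

```python
import random


def make_static_filters(input_dim, num_views, keep_per_view, seed=0):
    """Draw one fixed random subset of input dimensions per view.

    Hypothetical helper: the subsets are sampled once and never updated,
    which is how this sketch interprets "fixed dimension subsets".
    """
    rng = random.Random(seed)
    return [
        sorted(rng.sample(range(input_dim), keep_per_view))
        for _ in range(num_views)
    ]


def filter_then_fuse(x, filters, fuse_weights):
    """Filter each view down to its own dimensions, then fuse densely.

    The fusion here is a single dense linear map over the concatenated
    views (an assumption; the abstract only says "dense fusion").
    """
    # Filter step: every view sees only its own subset of dimensions.
    views = [[x[i] for i in idx] for idx in filters]
    # Fuse step: concatenate the filtered views and apply dense weights.
    concat = [value for view in views for value in view]
    return [sum(w * v for w, v in zip(row, concat)) for row in fuse_weights]


# Toy usage with made-up sizes: 8 input dims, 3 views of 2 dims, 4 outputs.
filters = make_static_filters(input_dim=8, num_views=3, keep_per_view=2, seed=0)
weight_rng = random.Random(1)
fuse_weights = [[weight_rng.uniform(-1.0, 1.0) for _ in range(3 * 2)] for _ in range(4)]
x = [0.5, 0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0]
out = filter_then_fuse(x, filters, fuse_weights)
```

In this sketch the sparsity is structural: each view is wired to only `keep_per_view` of the input dimensions, and which dimensions those are is decided before training. A dynamic strategy such as ICS would instead choose the retained dimensions per input, which this example does not attempt.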