How Much Do Reviews Really Contribute? A Study on Text-Enriched Matrix Factorization for Recommendations

2026-06-15

Information Retrieval, Artificial Intelligence
AI summary

The authors study how adding text from user reviews helps improve recommendation systems that predict ratings. They try different ways to combine text information with traditional collaborative data, including special methods to decide how much to rely on text. Their experiments show that, while these methods make the model more flexible, the extra text does not significantly improve accuracy beyond what traditional collaborative data already provides. This means, according to the authors, the usual rating-based information remains more important for making recommendations.

Recommender System, Collaborative Filtering, Matrix Factorization, Textual Reviews, Topic Profiles, Embedding, Gating Mechanism, Cross-Attention, Rating Prediction, Semantic Information
Authors
Eduardo Ferreira da Silva, Mayki dos Santos Oliveira, Joel Machado Pires, Denis Dantas Boaventura, Frederico Araújo Durão
Abstract
Incorporating textual reviews into a Recommender System has become a prominent strategy for enriching collaborative signals with semantic information. However, the actual contribution of review-derived representations remains an open question, particularly when strong collaborative baselines are employed. In this work, we systematically investigate the impact of textual information on Matrix Factorization by introducing and comparing three enrichment strategies over a common collaborative backbone. First, we propose a learnable gating mechanism that adaptively balances collaborative and textual signals during training. This mechanism is applied to two distinct review representations: (i) aggregated topic profiles extracted from user and item histories, and (ii) full text embedding representations derived from reviews. Additionally, we explore a cross-attention mechanism that identifies and emphasizes the most informative dimensions of the textual representation before fusion with collaborative factors. We evaluate six variants: pure, enriched with topic profiles and text via gating; enriched with topics and text via gating; and enhanced with cross-attention over textual features. Experiments across multiple review-based datasets reveal that although adaptive fusion mechanisms improve representation flexibility, the marginal contribution of textual signals remains limited compared to the collaborative backbone. These findings suggest that, under typical rating-prediction settings, collaborative information continues to dominate performance, raising important considerations for the effective integration of semantic review signals into recommendation models.
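
To make the idea of a learnable gate between collaborative and textual signals concrete, the following is a minimal NumPy sketch of one plausible form of gated fusion on top of biased Matrix Factorization. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the gate's exact form, so the sigmoid gate over the concatenated vectors, the linear projection of the text vector, and all names (`gated_fusion`, `predict_rating`, `W_proj`, `W_gate`, `b_gate`) are hypothetical. Parameters are randomly initialized here; in practice they would be learned jointly with the latent factors.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def gated_fusion(collab, text, W_proj, W_gate, b_gate):
    """Blend a collaborative factor with a review-derived vector.

    collab : (k,) latent factor from matrix factorization
    text   : (d,) review representation (topic profile or text embedding)
    W_proj : (k, d) projection of the text vector into the latent space
    W_gate : (k, 2k) gate weights; b_gate : (k,) gate bias
    All parameter names and shapes are assumptions for illustration.
    """
    text_k = W_proj @ text  # project text into the k-dimensional latent space
    # Per-dimension gate in (0, 1): close to 1 favors the collaborative signal,
    # close to 0 favors the textual signal.
    gate = sigmoid(W_gate @ np.concatenate([collab, text_k]) + b_gate)
    return gate * collab + (1.0 - gate) * text_k


def predict_rating(p_u, q_i, mu, b_u, b_i):
    """Biased matrix factorization prediction: global mean + biases + dot product."""
    return mu + b_u + b_i + float(p_u @ q_i)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    k, d = 8, 32  # latent size and text-vector size (arbitrary choices)

    # Collaborative factors and review-derived vectors for one user and one item.
    p_u, q_i = rng.normal(size=k), rng.normal(size=k)
    t_u, t_i = rng.normal(size=d), rng.normal(size=d)

    # Randomly initialized fusion parameters, shared by user and item for brevity.
    W_proj = rng.normal(scale=0.1, size=(k, d))
    W_gate = rng.normal(scale=0.1, size=(k, 2 * k))
    b_gate = np.zeros(k)

    p_u_fused = gated_fusion(p_u, t_u, W_proj, W_gate, b_gate)
    q_i_fused = gated_fusion(q_i, t_i, W_proj, W_gate, b_gate)

    print(predict_rating(p_u_fused, q_i_fused, mu=3.5, b_u=0.1, b_i=-0.2))
```

In a sketch like this, a gate that saturates near 1 after training would reduce the model to plain Matrix Factorization, which is one way the limited marginal contribution of text reported above could show up in practice.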