PeReGrINE: Evaluating Personalized Review Fidelity with User Item Graph Context

2026-04-09 • Information Retrieval

Information Retrieval • Computation and Language

AI summary

The authors created PeReGrINE, a benchmark and evaluation framework for testing AI systems that write personalized product reviews from connected data about users and items. They reorganized Amazon review data into a user-item graph that respects time order, and they summarize each user's writing style and emotional tone in a User Style Parameter instead of feeding the model the user's sparse raw review history. The benchmark compares four ways of gathering evidence for generation (product-only, user-only, neighbor-only, and combined) and adds a new evaluation method, Dissonance Analysis, which measures how far generated reviews deviate from the user's expected style and from the consensus opinion about the product. They also tested images as an additional source of context and found that images improve text quality in some settings, while graph-derived evidence remains the main driver of personalization and consistency.

Personalized review generation, Graph-structured data, User-item bipartite graph, User Style Parameter, Retrieval-conditioned language models, Temporal consistency, Dissonance Analysis, Amazon Reviews dataset, Visual evidence in NLP, Evaluation framework

Authors

Steven Au, Baihan Lin

Abstract

We introduce PeReGrINE, a benchmark and evaluation framework for personalized review generation grounded in graph-structured user-item evidence. PeReGrINE restructures Amazon Reviews 2023 into a temporally consistent bipartite graph, where each target review is conditioned on bounded evidence from user history, item context, and neighborhood interactions under explicit temporal cutoffs. To represent persistent user preferences without conditioning directly on sparse raw histories, we compute a User Style Parameter that summarizes each user's linguistic and affective tendencies over prior reviews. This setup supports controlled comparison of four graph-derived retrieval settings: product-only, user-only, neighbor-only, and combined evidence. Beyond standard generation metrics, we introduce Dissonance Analysis, a macro-level evaluation framework that measures deviation from expected user style and product-level consensus. We also study visual evidence as an auxiliary context source and find that it can improve textual quality in some settings, while graph-derived evidence remains the main driver of personalization and consistency. Across product categories, PeReGrINE offers a reproducible way to study how evidence composition affects review fidelity, personalization, and grounding in retrieval-conditioned language models.
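
The four retrieval settings and the temporal cutoff can be illustrated with a minimal sketch. It is not the authors' implementation. The `Review` record, the `gather_evidence` function, the neighbor definition (other users who previously reviewed an item the target user also reviewed), and the "most recent `k`" bound are all assumptions made here for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    """Hypothetical review record; field names are illustrative only."""
    user: str
    item: str
    time: int  # any sortable timestamp
    text: str


def gather_evidence(reviews, target, setting, k=3):
    """Return up to k reviews written strictly before the target review.

    setting is one of "product", "user", "neighbor", "combined".
    """
    # Temporal cutoff: nothing at or after the target's timestamp is visible.
    prior = [r for r in reviews if r.time < target.time]

    # User history: the target user's own earlier reviews.
    by_user = [r for r in prior if r.user == target.user]

    # Item context: earlier reviews of the target item by other users.
    by_item = [r for r in prior
               if r.item == target.item and r.user != target.user]

    # Neighborhood (assumed definition): users who reviewed an item the
    # target user reviewed earlier, and their reviews of other items.
    seen_items = {r.item for r in by_user}
    neighbors = {r.user for r in prior
                 if r.item in seen_items and r.user != target.user}
    by_neighbor = [r for r in prior
                   if r.user in neighbors and r.item != target.item]

    pools = {
        "product": by_item,
        "user": by_user,
        "neighbor": by_neighbor,
        "combined": by_item + by_user + by_neighbor,
    }
    # Bounded evidence: keep only the k most recent reviews in the pool.
    return sorted(pools[setting], key=lambda r: r.time, reverse=True)[:k]
```

Filtering on `r.time < target.time` before building any pool is what keeps every setting temporally consistent in this sketch: the target review and anything written after it can never appear as evidence, whichever pool is selected.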
