Judgment-Grounded Expansion for Peer Review Generation

2026-06-22
Computation and Language

AI summary

The authors explore a middle ground between fully automated and fully manual scientific review writing by creating a system where a human reviewer gives a key judgment, and the AI expands it into detailed review comments. They treat this as a step-by-step process of generating, checking, and refining the AI's suggestions. To improve and test this approach, they simulate interactions and develop ways to manage multiple generated comment options efficiently. Their work lays foundational ideas for future tools that combine human insight and AI support in writing scientific reviews.

automatic review generation, human-AI collaboration, judgment-grounded expansion, generate-check-refine, conformal prediction, candidate set curation, scalable evaluation, scientific peer review
Authors
Sheng Lu, Lizhen Qu, Iryna Gurevych
Abstract
Automatic review generation is a promising direction for accelerating scientific progress. While most work adopts an end-to-end setup, its fully automated nature makes it less suitable for settings that demand accountability. To better balance automation and accountability, we formalize judgment-grounded expansion, a human-AI collaboration mode where a reviewer provides an evaluative claim and the system expands it into one or more candidate review comments. We model it as a structured generate-check-refine process and conduct a user study to collect human-model interaction data. We study two practical challenges for judgment-grounded expansion: scalable evaluation and candidate set curation. We develop methods to simulate the process for large-scale evaluation, and show that conformal prediction is well suited to balancing candidate set size and target coverage. Our work establishes judgment-grounded expansion as a concrete task and provides empirical and methodological foundations for the design of future collaborative review generation systems.
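
To illustrate how conformal prediction can trade candidate set size against target coverage, here is a minimal sketch of standard split conformal prediction. The abstract does not specify the scoring function or the conformal variant the authors use, so everything here is an assumption: the function names `conformal_threshold` and `curate_candidates` are hypothetical, each candidate is assumed to carry a model confidence in [0, 1], and the nonconformity score is taken to be one minus that confidence.

```python
import math


def conformal_threshold(calibration_scores, alpha):
    """Split-conformal quantile of nonconformity scores.

    calibration_scores: one nonconformity score per calibration example
    (assumed here: 1 - confidence of the comment the reviewer accepted).
    alpha: tolerated miscoverage rate, so target coverage is 1 - alpha.
    """
    n = len(calibration_scores)
    if n == 0:
        raise ValueError("need at least one calibration score")
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: keep every candidate.
        return math.inf
    return sorted(calibration_scores)[k - 1]


def curate_candidates(candidates, threshold):
    """Keep candidates whose nonconformity (1 - confidence) is within threshold.

    candidates: list of (comment_text, confidence) pairs (hypothetical format).
    """
    return [text for text, confidence in candidates
            if 1.0 - confidence <= threshold]


# Hypothetical calibration scores and candidates, for illustration only.
calibration = [0.05, 0.10, 0.20, 0.30, 0.45, 0.60, 0.80]
threshold = conformal_threshold(calibration, alpha=0.25)
kept = curate_candidates(
    [("comment A", 0.875), ("comment B", 0.5), ("comment C", 0.25)],
    threshold,
)
```

Under these assumptions, a smaller `alpha` raises the threshold and yields larger candidate sets, while a larger `alpha` shrinks the set at the cost of coverage, which is the trade-off the abstract refers to.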