DyCoRM: Dynamic Criterion-Aware Reward Modeling for Text-to-Image Generation

2026-05-25 · Computer Vision and Pattern Recognition

AI summary

The authors focus on improving how computers judge the quality of images made from text descriptions by understanding specific user preferences. They created DyCoRM, a model that considers different detailed criteria to better match what users want. To test and train their approach, they built new datasets and a benchmark to evaluate these dynamic criteria. Their work is the first to offer a flexible and detailed way to assess generated images based on changing user needs.

text-to-image generation, reward models, user preference, dynamic criteria, fine-grained evaluation, preference comparison, dataset, benchmark, criterion-aware modeling, image quality assessment
Authors
Jiaying Qian, Ziheng Jia, Qian Zhang, Zicheng Zhang, Jiayi Guo, Junqi Zhang, Guangtao Zhai, Xiongkuo Min
Abstract
With the continued advancement of text-to-image (T2I) generation, producing high-quality images is becoming increasingly attainable; consequently, user demands are shifting toward images that better satisfy their specific requirements. As reward models play an increasingly important role in assessing whether generated images align with user preference, this trend introduces an important challenge for reward modeling: rather than relying solely on static and general evaluation dimensions, reward models should account for the task-relevant and fine-grained criteria through which users assess whether generated images meet their specific requirements. To address this challenge, we propose DyCoRM, a dynamic, criterion-aware reward model that grounds task-relevant criteria and performs criterion-aware preference comparison. To support this setting, we construct DyCoDataset-20K, which provides dynamic criteria together with criterion-level annotations, and further derive DyCoBench-1K, a benchmark for systematically evaluating reward models under dynamic criteria. We further introduce DyCoPick, which applies criterion-aware reward modeling to selecting T2I images. Our contributions establish the first reward modeling framework for dynamic and fine-grained evaluation and practical application in T2I generation.