C-ReD: A Comprehensive Chinese Benchmark for AI-Generated Text Detection Derived from Real-World Prompts
2026-04-13 • Computation and Language • Artificial Intelligence
AI summary
The authors address the challenge of detecting AI-generated text in Chinese, where existing resources suffer from homogeneous data and cover only a few generator models. They introduce C-ReD, a new benchmark built from varied data sources and realistic prompts, to improve detection of AI-written Chinese content. Their experiments show that detectors trained on C-ReD perform well both on known AI models and on unseen ones, filling important gaps left by previous efforts. The resource is publicly available.
Large Language Models • AI-generated Text Detection • Chinese NLP • Benchmark Dataset • Model Generalization • Prompt Realism • Phishing • Academic Dishonesty • Data Homogeneity • In-domain Detection
Authors
Chenxi Qing, Junxi Wu, Zheng Liu, Yixiang Qiu, Hongyao Yu, Bin Chen, Hao Wu, Shu-Tao Xia
Abstract
Recently, large language models (LLMs) have become capable of generating highly fluent textual content. While they offer significant convenience to humans, they also introduce various risks, such as phishing and academic dishonesty. Numerous research efforts have been devoted to developing algorithms for detecting AI-generated text and to constructing relevant datasets. However, in the domain of Chinese corpora, challenges remain, including limited model diversity and data homogeneity. To address these issues, we propose C-ReD: a comprehensive Chinese Real-prompt AI-generated Detection benchmark. Experiments demonstrate that C-ReD not only enables reliable in-domain detection but also supports strong generalization to unseen LLMs and external Chinese datasets, addressing critical gaps in model diversity, domain coverage, and prompt realism that have limited prior Chinese detection benchmarks. We release our resources at https://github.com/HeraldofLight/C-ReD.
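The abstract frames detection as training a classifier on labeled human/AI text and checking whether it transfers to text from a generator it never saw. The sketch below is a minimal illustration of that evaluation setup, not the authors' code: the texts, labels, and the TF-IDF + logistic-regression detector are all placeholder assumptions chosen only to show the in-domain-train / unseen-generator-test pattern.

```python
# Hypothetical sketch (NOT the C-ReD pipeline): train a toy detector on
# labeled text, then score it on held-out samples standing in for an
# "unseen" generator. All data below is placeholder toy data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Toy training corpus: label 1 = AI-generated, 0 = human-written.
train_texts = [
    "model generated fluent reply",
    "human wrote this note",
    "model generated another sample",
    "human diary entry today",
]
train_labels = [1, 0, 1, 0]

# Held-out texts standing in for output of an unseen LLM.
test_texts = ["model generated fluent answer", "human shopping list today"]
test_labels = [1, 0]

vec = TfidfVectorizer()
clf = LogisticRegression()
clf.fit(vec.fit_transform(train_texts), train_labels)

preds = clf.predict(vec.transform(test_texts))
acc = accuracy_score(test_labels, preds)
print(f"generalization accuracy: {acc:.2f}")
```

In a real benchmark study, the held-out split would contain text from generator models and domains excluded from training, and the detector would be a far stronger model than this bag-of-words baseline; the split discipline, not the classifier, is the point being illustrated.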