IS-CoT: Breaking the Long-form Generation Collapse via Interleaved Structural Thinking

2026-06-08 • Computation and Language

AI summary

The authors find that current reasoning-enhanced large language models lose quality sharply in open-ended writing once the target length exceeds about 2,000 words. They attribute this to static hierarchical planning, which cannot adjust its guidance as the text grows. To fix this, they propose IS-CoT, which interleaves planning, writing, and reflecting throughout generation so the model can adapt its strategy as it goes. They build a dataset of interleaved reasoning traces with a multi-teacher pipeline and train IS-Writer-8B, which achieves state-of-the-art results on long-form writing benchmarks and stays coherent and length-compliant at long target lengths.

Large Language Models, Long-form content generation, Hierarchical planning, Chain-of-Thought, Dynamic planning, Text coherence, Length collapse, Multi-teacher pipeline, IS-CoT, IS-Writer-8B

Authors

Zechen Sun, Yuyang Sun, Zecheng Tang, Juntao Li, Wenpeng Hu, Wenliang Chen, Zhunchen Luo, Guotong Geng, Min Zhang

Abstract

Generating coherent and controllable long-form content remains a persistent challenge for Large Language Models (LLMs). While reasoning-enhanced models have demonstrated success in logic-intensive domains, our evaluation reveals that they suffer from a severe length collapse in open-ended writing, where performance degrades sharply as target lengths exceed 2,000 words. We attribute this failure to the limitation of static hierarchical planning, which struggles to provide dynamic guidance over extended contexts. To bridge this gap, we introduce the Interleaved Structural Chain-of-Thought (IS-CoT) framework. Unlike external agentic workflows, IS-CoT embeds a dynamic Plan-Write-Reflect cycle into the generation process, enabling continuous strategy adaptation and global alignment without additional assistance. Based on this framework, we construct a high-quality dataset of interleaved reasoning traces via a multi-teacher pipeline and train IS-Writer-8B. Experiments demonstrate that IS-Writer-8B achieves state-of-the-art performance on challenging long-form benchmarks (e.g., +3.08 vs. DeepSeek-V3.2 on LongBench-Write), exhibiting robust length compliance and coherence competitive with significantly larger proprietary models.
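
To make the Plan-Write-Reflect cycle concrete, the following is a minimal sketch of how an interleaved trace might be represented and checked. The abstract does not specify the trace format, so the `Step` type, the strict plan, write, reflect ordering, and the word-count-based length ratio are all illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical step kinds; the paper's actual trace markers are not given here.
CYCLE = ("plan", "write", "reflect")


@dataclass
class Step:
    kind: str  # one of CYCLE (assumed labels)
    text: str


def check_cycle(trace: List[Step]) -> bool:
    """Return True if the trace is one or more complete plan/write/reflect cycles."""
    if not trace or len(trace) % len(CYCLE) != 0:
        return False
    return all(step.kind == CYCLE[i % len(CYCLE)] for i, step in enumerate(trace))


def final_text(trace: List[Step]) -> str:
    """Keep only the written segments; plans and reflections are reasoning, not output."""
    return "\n\n".join(step.text for step in trace if step.kind == "write")


def word_count(text: str) -> int:
    """Whitespace-delimited word count (an assumed, simple length measure)."""
    return len(text.split())


def length_ratio(trace: List[Step], target_words: int) -> float:
    """Ratio of produced words to the requested target length."""
    return word_count(final_text(trace)) / target_words


# Toy trace with two cycles; the content is invented for illustration.
EXAMPLE_TRACE = [
    Step("plan", "Outline the opening section."),
    Step("write", "Long texts drift when plans are fixed."),
    Step("reflect", "Opening done; the next section should introduce the remedy."),
    Step("plan", "Introduce interleaved planning."),
    Step("write", "Interleaving lets the plan adapt."),
    Step("reflect", "Both sections align with the overall goal."),
]
```

In this sketch the reflection step is where a static outline would differ from an interleaved one: each new plan can take the preceding reflection into account, instead of being fixed before any text is written.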
