Context-Driven Incremental Compression for Multi-Turn Dialogue Generation

2026-06-10 • Computation and Language

Computation and Language, Machine Learning

AI summary

The authors address the problem that conversational AI systems become slower as the dialogue history they condition on grows, while shortcuts such as truncation or summarization can lose important information and compound errors. They propose Context-Driven Incremental Compression (C-DIC), which organizes a conversation into interleaved threads and keeps a revisable compressed state for each thread in a single compact memory. Training adapts truncated backpropagation-through-time (TBPTT) so the model learns cross-turn dependencies without backpropagating through the entire conversation history. Their experiments on long-form dialogue benchmarks report better performance and efficiency than previous methods, with stable inference latency and perplexity over hundreds of turns.

conversational agents, context compression, dialogue history, incremental compression, perplexity, truncated backpropagation-through-time, long-term dependencies, context threads, inference latency, dialogue modeling

Authors

Yeongseo Jung, Jaehyeok Kim, Eunseo Jung, Jiachuan Wang, Yongqi Zhang, Ka Chun Cheung, Simon See, Lei Chen

Abstract

Modern conversational agents condition on an ever-growing dialogue history at each turn, incurring redundant attention and encoding costs that grow with conversation length. Naive truncation or summarization degrades fidelity, while existing context compressors lack cross-turn memory sharing or revision, causing information loss and compounding errors in long dialogues. We revisit context compression under conversational dynamics and empirically demonstrate its fragility. To improve both efficiency and robustness, we introduce Context-Driven Incremental Compression (C-DIC), which treats a conversation as interleaved contextual threads and stores revisable per-thread compression states in a single, compact dialogue memory. At each turn, a lightweight retrieve, revise, and write-back loop shares information across turns and updates stale memories, stabilizing long-horizon behavior. In addition, we adapt truncated backpropagation-through-time (TBPTT) to our multi-turn setting, learning cross-turn dependencies without full-history backpropagation. Extensive experiments on long-form dialogue benchmarks demonstrate the superior performance and efficiency of C-DIC; notably, C-DIC shows stable inference latency and perplexity over hundreds of dialogue turns, supporting a scalable path to high-quality dialogue modeling.
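
The retrieve, revise, and write-back loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify how threads are assigned or how compression states are represented, so everything here is an assumption made for illustration. The names `DialogueMemory`, `assign_thread`, and `process_turn` are hypothetical, threads are routed by simple keyword matching, and the "compression" is a placeholder that keeps only the most recent utterances per thread where a real system would presumably use a learned compressor.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueMemory:
    """Toy dialogue memory: one bounded, revisable state per thread (assumption)."""

    max_items_per_thread: int = 3  # assumed to be >= 1
    threads: dict = field(default_factory=dict)  # thread key -> list of utterances

    def retrieve(self, key):
        # Return a copy of the stored state (empty if the thread is new).
        return list(self.threads.get(key, []))

    def revise(self, state, utterance):
        # Placeholder for a learned compressor: append the new utterance,
        # then keep only the most recent items so the state stays bounded.
        revised = state + [utterance]
        return revised[-self.max_items_per_thread:]

    def write_back(self, key, state):
        # Overwrite the stored state with the revised one.
        self.threads[key] = state

    def size(self):
        # Total number of utterances held across all threads.
        return sum(len(state) for state in self.threads.values())


def assign_thread(utterance, topics):
    """Hypothetical router: first topic keyword found in the utterance, else 'misc'."""
    lowered = utterance.lower()
    for topic in topics:
        if topic in lowered:
            return topic
    return "misc"


def process_turn(memory, utterance, topics):
    """One turn: retrieve the thread state, revise it, and write it back."""
    key = assign_thread(utterance, topics)
    state = memory.retrieve(key)
    memory.write_back(key, memory.revise(state, utterance))
    return key


# Usage: interleaved threads share one memory whose size stays bounded.
memory = DialogueMemory(max_items_per_thread=2)
dialogue = [
    "I need a flight to Seoul.",
    "Also book a hotel near the station.",
    "Make the flight a morning departure.",
    "Actually, change the flight to Friday.",
    "Thanks!",
]
for turn in dialogue:
    process_turn(memory, turn, topics=("flight", "hotel"))
print(memory.threads)
```

In this sketch each turn touches only one thread's state, and the per-thread cap bounds that state's size, so per-turn work does not grow with the length of the conversation. That is the property the abstract attributes to a compact, revisable dialogue memory, shown here only in a simplified form.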
