Clarify, Abstain or Answer? Strategising in Conversation with Belief-Augmented Generation

2026-05-25, Computation and Language

Computation and Language, Artificial Intelligence, Machine Learning
AI summary

The authors studied how large language models (LLMs) handle uncertainty when answering questions. They introduced a method called Belief-Augmented Generation (BAG), which shows an LLM several of its own sampled answers and lets it decide whether to answer directly, ask for clarification, or abstain. They found that without BAG, LLMs rarely ask for clarification or abstain, even when they are uncertain. BAG improved accuracy and made the models' choices better reflect their actual uncertainty, although telling apart the cases that call for clarification from those that call for abstention remains difficult.

large language models, uncertainty representation, probabilistic belief state, sampling, question answering, prompting, selective prediction, clarification, abstention, multi-turn conversation
Authors
Joris Baan, Wilker Aziz, Barbara Plank, Raquel Fernández
Abstract
Large language models (LLMs) define a distribution over text, which can be viewed as a probabilistic representation of uncertainty: sampling K responses yields a belief state, that is, a set of responses the model deems plausible. Existing work exploits this representation for narrow tasks such as decoding or selective prediction, and often requires manual intervention rather than controlling generation directly. We propose Belief-Augmented Generation (BAG): grounding LLMs in their own belief state via the prompt and letting them reason over these K samples to decide on a conversational strategy: answer, clarify, or abstain. In a multi-turn ambiguous QA setting, we find that LLMs by default rarely clarify or abstain, ignoring uncertainty about the input or facts. BAG improves QA accuracy across six models and yields strategy decisions more faithful to the belief state than prompt-only baselines. Disentangling when to clarify from when to abstain, however, remains challenging.
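
The loop described in the abstract (sample K responses, place them in the prompt, let the model choose a strategy) might look roughly like the following Python sketch. It is an illustration under stated assumptions, not the authors' implementation: `sample_fn` and `decide_fn` are hypothetical callables standing in for an LLM, and the prompt wording, the `STRATEGY:`/`RESPONSE:` output format, and the fallback to "answer" are all invented here for the example.

```python
from typing import Callable, List, Tuple

STRATEGIES = ("answer", "clarify", "abstain")


def build_bag_prompt(question: str, samples: List[str]) -> str:
    """Place the K sampled responses (the belief state) into the prompt.

    The wording and output format below are assumptions for illustration.
    """
    listed = "\n".join(f"{i + 1}. {s}" for i, s in enumerate(samples))
    return (
        f"Question: {question}\n"
        f"Responses you previously sampled for this question:\n{listed}\n"
        "Considering how much these responses agree, choose one strategy: "
        "answer, clarify, or abstain.\n"
        "Reply as:\nSTRATEGY: <strategy>\nRESPONSE: <text>"
    )


def parse_decision(output: str) -> Tuple[str, str]:
    """Read the strategy and response from the model output.

    Falls back to treating the whole output as a direct answer when no
    valid STRATEGY line is found (an assumption of this sketch).
    """
    strategy, response = "answer", output.strip()
    for line in output.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip().lower(), value.strip()
        if key == "strategy" and value.lower() in STRATEGIES:
            strategy = value.lower()
        elif key == "response":
            response = value
    return strategy, response


def belief_augmented_generation(
    question: str,
    sample_fn: Callable[[str], str],   # hypothetical: draws one sampled response
    decide_fn: Callable[[str], str],   # hypothetical: generates from a prompt
    k: int = 5,
) -> Tuple[str, str]:
    """Sample k responses, ground the model in them, return its decision."""
    samples = [sample_fn(question) for _ in range(k)]
    prompt = build_bag_prompt(question, samples)
    return parse_decision(decide_fn(prompt))
```

In practice `sample_fn` would draw from the model at a nonzero temperature so that the K responses reflect its distribution; here both callables are left abstract so the sketch does not depend on any particular model API.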