ContextClaim: A Context-Driven Paradigm for Verifiable Claim Detection

2026-03-31

Computation and Language
AI summary

The authors study how to detect whether a statement is a factual claim that can be checked against evidence. Instead of relying on the claim's text alone, they introduce a method called ContextClaim that adds background information by retrieving related facts from Wikipedia. They test this approach on different datasets and with various AI models to see whether the added context helps. They find that including this extra information can improve the detection of verifiable claims, but its success depends on the topic and model used. The authors also analyze when and why the retrieved context is helpful.

verifiable claim detection, fact-checking, entity extraction, Wikipedia, large language models, context augmentation, zero-shot learning, few-shot learning, COVID-19 Twitter dataset, political debate dataset
Authors
Yufeng Li, Rrubaa Panchendrarajan, Arkaitz Zubiaga
Abstract
Verifiable claim detection asks whether a claim expresses a factual statement that can, in principle, be assessed against external evidence. As an early filtering stage in automated fact-checking, it plays an important role in reducing the burden on downstream verification components. However, existing approaches to claim detection, whether based on check-worthiness or verifiability, rely solely on the claim text itself. This is a notable limitation for verifiable claim detection in particular, where determining whether a claim is checkable may benefit from knowing what entities and events it refers to and whether relevant information exists to support verification. Inspired by the established role of evidence retrieval in later-stage claim verification, we propose Context-Driven Claim Detection (ContextClaim), a paradigm that advances retrieval to the detection stage. ContextClaim extracts entity mentions from the input claim, retrieves relevant information from Wikipedia as a structured knowledge source, and employs large language models to produce concise contextual summaries for downstream classification. We evaluate ContextClaim on two datasets covering different topics and text genres, the CheckThat! 2022 COVID-19 Twitter dataset and the PoliClaim political debate dataset, across encoder-only and decoder-only models under fine-tuning, zero-shot, and few-shot settings. Results show that context augmentation can improve verifiable claim detection, although its effectiveness varies across domains, model architectures, and learning settings. Through component analysis, human evaluation, and error analysis, we further examine when and why the retrieved context contributes to more reliable verifiability judgments.
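The abstract describes a three-stage pipeline: extract entity mentions from the claim, retrieve related information from Wikipedia, and summarize it with a large language model before classification. A minimal sketch of that flow is shown below; this is an illustration under stated assumptions, not the authors' implementation. The entity extractor, the `WIKI` lookup table, and the summarizer are hypothetical stand-ins for a real NER model, a Wikipedia retrieval step, and an LLM, respectively.

```python
# Hypothetical sketch of a ContextClaim-style pipeline (not the paper's code).
# Real components (NER model, Wikipedia API, LLM summarizer) are replaced
# with simple stand-ins so the data flow is visible end to end.

def extract_entities(claim: str) -> list[str]:
    # Stand-in for an entity-extraction model: keep capitalized tokens.
    return [tok.strip(".,") for tok in claim.split() if tok and tok[0].isupper()]

# Stand-in for Wikipedia as a structured knowledge source.
WIKI = {
    "COVID-19": "COVID-19 is a disease caused by the SARS-CoV-2 virus.",
}

def retrieve_context(entities: list[str]) -> list[str]:
    # Look up each extracted entity in the knowledge source.
    return [WIKI[e] for e in entities if e in WIKI]

def summarize(passages: list[str]) -> str:
    # Stand-in for an LLM producing a concise contextual summary:
    # here we just concatenate and truncate the retrieved passages.
    return " ".join(passages)[:200]

def augment_claim(claim: str) -> str:
    # Produce the context-augmented input passed to the downstream classifier.
    context = summarize(retrieve_context(extract_entities(claim)))
    return f"Claim: {claim}\nContext: {context}"
```

A downstream classifier (encoder-only or decoder-only, fine-tuned or prompted) would then receive the augmented string instead of the bare claim text.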