Multi-Agentic System Leveraging Open-Source LLMs to Mitigate Disinformation Threats

2026-06-29 • Computation and Language

AI summary

The authors address the growing problem of disinformation with an automated system that mimics how human fact-checkers decide whether information is true or false. The system combines multiple AI agents with different knowledge and reasoning styles, organized hierarchically and required to reach consensus, to improve accuracy. Tested on datasets in English, Polish, Slovak, and Bulgarian, it outperforms individual models such as GPT-4 and GPT-3.5. It relies on openly available models to make its operation more transparent. The system supports three tasks: detecting disinformation directly, identifying texts worth checking, and detecting texts that contain verifiable factual claims.

disinformation, fact-checking, multi-agent system, large language models, consensus mechanism, hierarchical structure, open AI models, natural language processing, low-resource languages, automated verification

Authors

Sebastian Kula, Martin Tamajka

Abstract

In contemporary societies, the threat of disinformation has reached alarming levels, exacerbated by the proliferation of electronic communication, social media, and advancements in artificial intelligence. As a result, there is an urgent need to develop effective countermeasures to mitigate this menace. However, the sheer scale of the problem renders manual fact-checking and human-based verification inadequate, underscoring the necessity for automated methods to detect and debunk disinformation. This article proposes a novel approach based on a multi-agent system that emulates the decision-making processes of human annotators engaged in disinformation detection tasks. By incorporating a consensus mechanism, diversity in cognition and knowledge, and a hierarchical structure, all inspired by human annotators' behavior, the proposed method achieves superior results compared to individual Large Language Models (LLMs), including GPT-4 and GPT-3.5. The system leverages open models (e.g., LLaMA, Kimi, Qwen, DeepSeek and LLaMA-Nemotron) to ensure greater transparency. The evaluation of the proposed method encompasses datasets in languages with varying resource availability, including English (high-resource), Polish (medium-resource), Slovak (low-resource) and Bulgarian (low-resource). Experiments were conducted on three tasks: direct disinformation detection, identification of texts worthy of verification, and detection of texts containing verifiable factual claims.
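
The abstract does not specify how the consensus mechanism and hierarchy are implemented. As a rough illustration only, the following Python sketch shows one way such a scheme could be arranged: several annotator agents vote on a label, and a higher-level adjudicator decides when agreement is too low. Every name here (consensus_label, the toy agents, the min_agreement threshold, the label strings) is a hypothetical stand-in and not the authors' method; in a real system each agent would wrap a prompt to an open LLM.

```python
from collections import Counter
from typing import Callable, Sequence

# An "agent" is any callable mapping a text to a label. In practice it would
# wrap a call to an open LLM; that wiring is assumed and not shown here.
Agent = Callable[[str], str]


def consensus_label(
    text: str,
    annotators: Sequence[Agent],
    adjudicator: Agent,
    min_agreement: float = 2 / 3,  # assumed threshold, not from the paper
) -> str:
    """Return the annotators' majority label, or escalate to the adjudicator."""
    if not annotators:
        raise ValueError("at least one annotator agent is required")
    votes = Counter(agent(text) for agent in annotators)
    label, count = votes.most_common(1)[0]
    if count / len(annotators) >= min_agreement:
        return label
    # Agreement is too low: defer to the higher-level agent in the hierarchy.
    return adjudicator(text)


# Toy stand-ins for LLM-backed agents with different decision rules.
def keyword_agent(text: str) -> str:
    return "disinformation" if "miracle cure" in text.lower() else "credible"


def cautious_agent(text: str) -> str:
    return "disinformation" if "!" in text else "credible"


def skeptical_agent(text: str) -> str:
    return "disinformation"


def senior_agent(text: str) -> str:
    return "credible"


if __name__ == "__main__":
    agents = [keyword_agent, cautious_agent, skeptical_agent]
    print(consensus_label("Miracle cure found!", agents, senior_agent))
    print(consensus_label("The council met on Tuesday.", agents, senior_agent))
```

Raising min_agreement toward 1.0 sends more borderline texts to the adjudicator, which trades extra model calls for stricter agreement; the value that works best would have to be determined empirically.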
