PolyGnosis 2.0: Enhancing LLM Reasoning via Agentic Harness Engineering for Polymarket and OSINT Insight Extraction
2026-05-25 • Computation and Language
Computation and Language; Computational Engineering, Finance, and Science
AI summary
The authors present PolyGnosis 2.0, a system that combines unusual market signals from Polymarket with news data from global sources to find differences in opinions, which can be valuable for predictions. They test various methods to improve the system’s thinking process, like breaking problems into parts and reflecting on answers, to work well in noisy financial environments. Their experiments show that dividing tasks helps but too much reflection can confuse the system, and they discover a common bias toward agreement that needs careful checking. In the end, they find the best setup that balances speed and accuracy for professional-level analysis in prediction markets.
Polymarket, Open Source Intelligence (OSINT), GDELT, Perspective Mismatches, multi-agent systems, Harness Engineering, reflection loops, divide-and-conquer, chain-of-thought, consensus bias
Authors
Daren Wang, Hong Xu, Jiawen Xian
Abstract
This paper introduces PolyGnosis 2.0, a pioneering multi-agent architecture designed to extract predictive intelligence by synthesizing Polymarket anomaly signals with global Open Source Intelligence (OSINT) streams, specifically the Global Database of Events, Language, and Tone (GDELT). We define "Perspective Mismatches" as the narrative divergence between Polymarket sentiment and global media flows, and we target them as high-alpha trading signals. Moving beyond generic claims of agentic superiority, we rigorously quantify the efficacy of "Harness Engineering" techniques, including reflection loops, tool-calling, divide-and-conquer (D&C) partitioning, and chain-of-thought (CoT), within high-noise financial domains. Our empirical evaluation against human-expert benchmarks reveals that while structural partitioning is mandatory for multi-dimensional alignment, unconstrained terminal reflection actively induces logical drift. Furthermore, we identify a pervasive "consensus bias" across all agent configurations during narrative reasoning, which necessitates deterministic validation. Ultimately, we isolate a Pareto-optimal configuration that achieves professional-grade analytical precision while minimizing latency and token overhead, providing a robust blueprint for autonomous intelligence in prediction markets.
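
To make the notion of a Pareto-optimal configuration concrete, the following is a minimal sketch of how a non-dominated set could be selected from candidate harness configurations scored on accuracy, latency, and token usage. It is not the authors' procedure: the class HarnessConfig, the configuration names, and every number below are hypothetical placeholders chosen only for illustration, not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HarnessConfig:
    """One candidate agent setup (hypothetical fields, not from the paper)."""
    name: str
    accuracy: float   # higher is better, e.g. agreement with expert labels
    latency_s: float  # lower is better
    tokens: int       # lower is better


def dominates(a: HarnessConfig, b: HarnessConfig) -> bool:
    """True if a is no worse than b on every axis and strictly better on one."""
    no_worse = (
        a.accuracy >= b.accuracy
        and a.latency_s <= b.latency_s
        and a.tokens <= b.tokens
    )
    strictly_better = (
        a.accuracy > b.accuracy
        or a.latency_s < b.latency_s
        or a.tokens < b.tokens
    )
    return no_worse and strictly_better


def pareto_front(configs: list[HarnessConfig]) -> list[HarnessConfig]:
    """Keep only the configurations that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Placeholder values for illustration only; they are NOT measurements.
configs = [
    HarnessConfig("cot_only", accuracy=0.60, latency_s=4.0, tokens=2000),
    HarnessConfig("dc_cot", accuracy=0.80, latency_s=6.0, tokens=3500),
    HarnessConfig("dc_cot_reflection", accuracy=0.75, latency_s=11.0, tokens=7000),
    HarnessConfig("dc_tools", accuracy=0.80, latency_s=9.0, tokens=5000),
]

front = pareto_front(configs)
front_names = [c.name for c in front]
print(front_names)
```

With these placeholder scores, any configuration that costs more latency and tokens without gaining accuracy drops out of the front. Choosing a single operating point from the remaining set still requires a stated preference between precision and cost.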