The Accountability Horizon: An Impossibility Theorem for Governing Human-Agent Collectives

2026-04-09 • Artificial Intelligence

AI summary

The author shows that once AI systems become sufficiently autonomous and interact with humans in feedback loops, traditional accountability rules can no longer assign responsibility for outcomes. The paper builds a formal model of human-AI teams (Human-Agent Collectives) and defines legitimate accountability through four properties. The main finding is that if the collective's compound autonomy exceeds a threshold, the Accountability Horizon, and its interaction graph contains a human-AI feedback cycle, no accountability framework can satisfy all four properties at once unless autonomy is reduced. Experiments on 3,000 synthetic collectives are reported to match these predictions, pointing to a fundamental limit in current AI governance approaches.

Accountability, Autonomy, Causal model, Human-Agent Collectives, Information theory, Foreseeability, Responsibility attribution, AI governance, Impossibility theorem, Feedback cycle

Authors

Haileleol Tibebu

Abstract

Existing accountability frameworks for AI systems (legal, ethical, and regulatory) rest on a shared assumption: for any consequential outcome, at least one identifiable person had enough involvement and foresight to bear meaningful responsibility. This paper proves that agentic AI systems violate this assumption not as an engineering limitation but as a mathematical necessity once autonomy exceeds a computable threshold. We introduce Human-Agent Collectives, a formalisation of joint human-AI systems where agents are modelled as state-policy tuples within a shared structural causal model. Autonomy is characterised through a four-dimensional information-theoretic profile (epistemic, executive, evaluative, social); collective behaviour through interaction graphs and joint action spaces. We axiomatise legitimate accountability through four minimal properties: Attributability (responsibility requires causal contribution), Foreseeability Bound (responsibility cannot exceed predictive capacity), Non-Vacuity (at least one agent bears non-trivial responsibility), and Completeness (all responsibility must be fully allocated). Our central result, the Accountability Incompleteness Theorem, proves that for any collective whose compound autonomy exceeds the Accountability Horizon and whose interaction graph contains a human-AI feedback cycle, no framework can satisfy all four properties simultaneously. The impossibility is structural: transparency, audits, and oversight cannot resolve it without reducing autonomy. Below the threshold, legitimate frameworks exist, establishing a sharp phase transition. Experiments on 3,000 synthetic collectives confirm all predictions with zero violations. This is the first impossibility result in AI governance, establishing a formal boundary below which current paradigms remain valid and above which distributed accountability mechanisms become necessary.
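
The tension between the four properties can be illustrated with a toy numerical sketch. It is not the paper's formalism: the `Agent` fields, the scalar `causal` and `foresight` scores, the `eps` cutoff for "non-trivial" responsibility, and the greedy `allocate` helper are all assumptions introduced here for illustration. The feedback-cycle condition and the Accountability Horizon itself are not modelled. Under these assumptions, a complete allocation exists only if the foresight of causally contributing agents sums to at least the whole outcome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str
    causal: float     # hypothetical causal contribution score in [0, 1]
    foresight: float  # hypothetical predictive capacity score in [0, 1]


def check(agents, resp, eps=0.05, tol=1e-9):
    """Report which of the four properties an allocation `resp` satisfies."""
    return {
        # Responsibility only where there is causal contribution.
        "attributability": all(resp[a.name] <= tol or a.causal > 0 for a in agents),
        # Responsibility never exceeds predictive capacity.
        "foreseeability_bound": all(resp[a.name] <= a.foresight + tol for a in agents),
        # At least one agent bears a non-trivial share (eps is an assumed cutoff).
        "non_vacuity": any(resp[a.name] >= eps for a in agents),
        # Shares sum to the whole outcome, normalised to 1.
        "completeness": abs(sum(resp[a.name] for a in agents) - 1.0) <= tol,
    }


def allocate(agents, eps=0.05):
    """Greedy toy allocation; returns None when no allocation can satisfy all four."""
    # Each agent's cap is its foresight, or zero if it made no causal contribution.
    caps = {a.name: (a.foresight if a.causal > 0 else 0.0) for a in agents}
    if sum(caps.values()) < 1.0 or max(caps.values(), default=0.0) < eps:
        return None
    remaining, resp = 1.0, {}
    # Fill the largest caps first until the whole outcome is allocated.
    for name, cap in sorted(caps.items(), key=lambda kv: -kv[1]):
        resp[name] = min(cap, remaining)
        remaining -= resp[name]
    return resp


# Hypothetical collectives: foresight shrinks as the agent acts more independently.
low_autonomy = [Agent("operator", 0.6, 0.9), Agent("ai_agent", 0.4, 0.4)]
high_autonomy = [Agent("operator", 0.2, 0.3), Agent("ai_agent", 0.8, 0.2)]

low_result = allocate(low_autonomy)    # a legitimate allocation exists
high_result = allocate(high_autonomy)  # total foresight is 0.5, so none exists
```

In the second collective, any allocation that sums to 1 must give some agent more responsibility than its foresight allows, so Completeness and the Foreseeability Bound cannot both hold. This mirrors only the flavour of the stated result; the actual theorem depends on the paper's information-theoretic autonomy profile and causal model.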
