DAR: Deontic Reasoning with Agentic Harnesses

2026-06-03 • Computation and Language

Computation and Language, Artificial Intelligence

AI summary

The authors study how large language models (LLMs) handle deontic reasoning: applying explicit rules to case-specific facts to answer questions such as legal or tax decisions. They note that models can struggle to find the right rules when the ruleset is long and cross-referenced. To address this, the authors introduce Deontic Agentic Reasoning (DAR), in which the model looks up statutes on demand. Evaluating DAR on hard subsets of DeonticBench, they find that it helps on difficult tasks, but the gains are not uniform: weaker models often get worse on numerical tasks while consuming far more tokens.

deontic reasoning, large language models, ruleset, agentic reasoning, statutes, DeonticBench, numerical tasks, token consumption

Authors

Guangyao Dou, William Jurayj, Nils Holzenberger, Benjamin Van Durme

Abstract

Deontic reasoning is the task of answering questions by applying explicit rules and policies to case-specific facts, for example computing tax liability under a statute or determining the outcome of an immigration appeal. A key technical challenge for LLM-based deontic reasoning is that the relevant ruleset can be long and cross-referenced, so models may still fail to locate the rules needed for a particular reasoning step. We introduce Deontic Agentic Reasoning (DAR), an agentic reasoning setup in which the model interacts with the statutes on demand. We evaluate DAR under multiple harnesses on hard subsets of DeonticBench. Across these settings, we find that agentic harnesses can push the frontier on deontic reasoning tasks, but improvements are not uniform: weaker models often degrade on numerical tasks while consuming far more tokens.
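
To make "interacts with the statutes on demand" concrete, here is a minimal sketch of what on-demand lookup over a cross-referenced ruleset could look like. It is not the authors' harness: the toy `RULESET`, the `§N` citation format, and the functions `lookup`, `cross_references`, and `gather` are all hypothetical names invented for illustration, and no language model is involved. The sketch only shows the retrieval pattern the abstract motivates, where a rule is fetched when a reasoning step needs it and its citations are followed to the rules it depends on.

```python
import re

# Hypothetical toy ruleset (section id -> text); not taken from DeonticBench.
RULESET = {
    "1": "Tax is 10% of taxable income as defined in §2.",
    "2": "Taxable income is gross income minus the deduction in §3.",
    "3": "The standard deduction is 1000.",
    "4": "Unrelated filing deadline rule.",
}

# Assumed citation format: a section sign followed by digits, e.g. "§2".
REF_PATTERN = re.compile(r"§(\d+)")


def lookup(section_id):
    """Stand-in for a tool call: return one section's text, or None if missing."""
    return RULESET.get(section_id)


def cross_references(text):
    """Return the section ids cited in a piece of rule text."""
    return REF_PATTERN.findall(text)


def gather(start_id):
    """Follow cross-references from start_id, fetching each section at most once."""
    fetched = {}
    pending = [start_id]
    while pending:
        section_id = pending.pop()
        if section_id in fetched:
            continue  # already retrieved; also guards against citation cycles
        text = lookup(section_id)
        if text is None:
            continue  # dangling citation: nothing to fetch
        fetched[section_id] = text
        pending.extend(cross_references(text))
    return fetched


if __name__ == "__main__":
    for section_id, text in sorted(gather("1").items()):
        print(section_id, text)
```

Starting from section 1, the sketch retrieves sections 1, 2, and 3 and never touches the unrelated section 4, which is the point of fetching on demand rather than placing the whole ruleset in context. In an agentic setup the model itself would presumably decide which lookups to issue; the fixed traversal here is a simplification.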
