Trivium: Temporal Regret as a First-Class Objective for Causal-Memory Controllers

2026-06-03 • Artificial Intelligence

Artificial Intelligence, Machine Learning

AI summary

The authors argue that many AI systems correct mistakes only by checking whether the final result is right or wrong, which does not reveal when or why errors recur. They introduce three kinds of regret (outcome, epistemic, and temporal) that track what mistake occurred, why it happened, and how long it persists before being fixed. Their approach uses these signals to learn causal relationships and reduce repeated errors over time. They also test their method, called Trivium, and report that it improves long-horizon performance compared to methods that look only at outcomes. The method revises an external model of cause and effect rather than retraining the LLM's weights.

outcome regret, epistemic regret, temporal regret, causal model, long-horizon learning, LLM, intervention channel, probe complexity, change-points, self-learning

Authors

Edward Y. Chang

Abstract

Many current agentic systems and LLM pipelines correct mistakes by optimizing outcome reward. This addresses only the what of failure: when an outcome diverges from prediction, the why and when of the mismatch are not systematically logged, reviewed, or corrected, so the same error can recur episode after episode. We argue that this is a structural problem, not merely a model-capacity one. We propose long-horizon temporal regret as a first-class objective alongside outcome regret and epistemic regret over the working causal model. Temporal regret captures when failure persists: how long a miscalibrated causal model is tolerated before correction. Epistemic regret captures why failure persists: residual uncertainty or error in the working causal model. Together, the three regrets give a falsifiable account of what, why, and when a long-lived agent can fail. Modeling the agent as a stream of E episodes, we prove three conditional results under explicit causal-probing, persistence, and detectability assumptions. First, under observationally equivalent confounding, outcome-only learning cannot distinguish causal from spurious structure without an intervention channel, so temporal miscalibration can persist linearly even after outcome regret is driven to zero. Second, with a persistent causal log and budgeted probes, total probe complexity is logarithmic in the episode horizon, inducing O(log E) temporal regret. Third, under K detectable change-points, the rate extends to O(K log E). We instantiate Trivium and pre-register five falsifiable predictions. On CausalBench-Seq, Trivium follows the predicted logarithmic envelope while outcome-only baselines grow linearly. A pilot real-LLM stream provides preliminary external-validity evidence across one full E = 500 run and three E = 100 frontier-model pilots. Self-learning here means revising an external causal model, not retraining LLM weights.
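
To make the scaling claims concrete, the following toy sketch counts probes under a doubling schedule that restarts at each detected change-point. It is an illustration of how O(log E) and O(K log E) probe counts can arise, not Trivium's algorithm: the doubling schedule, the function name `probe_episodes`, and the example change-point positions are all assumptions introduced here, and the abstract does not specify how probes are budgeted.

```python
def probe_episodes(horizon, change_points=()):
    """Return the episodes at which a probe fires (hypothetical schedule).

    Assumption: within each stationary segment, probes fire at offsets
    0, 1, 3, 7, ... from the segment start (gaps double after each probe),
    and the schedule restarts at every detected change-point.
    """
    # Segment starts: episode 1 plus every change-point inside the horizon.
    starts = [1] + sorted({c for c in change_points if 1 < c <= horizon})
    probes = []
    for i, start in enumerate(starts):
        # A segment ends just before the next change-point, or at the horizon.
        end = starts[i + 1] - 1 if i + 1 < len(starts) else horizon
        episode, gap = start, 1
        while episode <= end:
            probes.append(episode)
            episode += gap
            gap *= 2  # doubling gaps give a logarithmic number of probes
    return probes


if __name__ == "__main__":
    # No change-points, E = 500: probes at 1, 2, 4, ..., 256.
    print(len(probe_episodes(500)))              # 9
    # K = 2 hypothetical change-points at episodes 101 and 301.
    print(len(probe_episodes(500, (101, 301))))  # 23
```

With E = 500 and no change-points the schedule uses 9 probes, which is floor(log2 E) + 1. With the two example change-points it uses 23, within the (K + 1)(floor(log2 E) + 1) = 27 bound. An agent with no probes at all has nothing in this toy model that would ever end a miscalibrated stretch, which is the intuition behind the linear baseline.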
