From Detecting Agency to Doing Work: Self-Caused Credit Builds a Durable Behavioral Self in a Minimal Spiking Agent
2026-06-29 • Artificial Intelligence
Artificial Intelligence, Machine Learning, Neural and Evolutionary Computing
AI summary
The authors study how an agent can develop lasting behavior that reflects its sense of self. They find that when slow learning updates are driven by the product of three signals (Own, Agency, and Salience), the agent retains a learned behavior even after its episodic buffer is removed. This mechanism, called agency-gated slow credit, also limits forgetting across eight sequentially learned tasks without a replay buffer or a task-boundary-dependent protection mechanism. The work suggests that slow, self-caused credit assignment is a necessary building block for agents that maintain a durable behavioral self; the authors make no claim about consciousness.
agency detection, slow credit assignment, spiking neural networks, episodic memory, task forgetting, plasticity, self-preservation, multiplicative gating, reinforcement learning, Nengo LIF/PES model
Authors
Haoliang Han
Abstract
How does an agent that can tell self from world come to be durably shaped by that distinction? Recent work shows that a predictive system can detect its own agency (Ye, 2026), but detecting agency does not explain durable, self-shaped behavior. We show that agency-gated slow credit -- a conjunctive term Own*Agency*Salience driving a slow parameter update -- produces post-unload behavioral residue: on a spiking substrate (Nengo LIF/PES), a learned self-preserving choice survives episodic buffer removal (retained fraction 0.96, N=50) and collapses when the slow decoders are reset or the agency gate is removed. Reproducing the agency comparator and toggling only the slow-credit channel, we find a clean dissociation: at matched agency gain, durable behavior develops only when self-credit performs slow work (post-unload self-preservation 1.00 vs 0.00). The same dissociation holds in 24-dimensional partially-observed control (0.74 vs 0.00), and a plastic-work analysis shows that basin deformation equals net self-credit work. Across eight sequentially-learned tasks under exogenous interference, the multiplicative veto also prevents forgetting: it retains old tasks (final post-unload accuracy 0.88, forgetting 0.13) where additive pooling collapses to chance-level recall, the no-agency ablation falls below chance, and episodic/replay baselines stay near chance after unload -- all with no replay buffer and no task-boundary-dependent protection mechanism (N=50). We formalize the durable residue as an operational behavioral self and argue that self-caused credit doing slow work is a necessary building block for agents that develop a self. No claim of consciousness is made.
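The contrast between the multiplicative veto and additive pooling can be illustrated with a minimal sketch. It is not the paper's Nengo LIF/PES implementation. The function names, the assumption that each signal lies in [0, 1], the learning rate `slow_lr`, and the error-driven form of the update are all hypothetical choices made for illustration only.

```python
def multiplicative_gate(own, agency, salience):
    """Conjunctive gate Own*Agency*Salience: a zero in any factor vetoes credit."""
    return own * agency * salience


def additive_gate(own, agency, salience):
    """Additive pooling (assumed here to be a plain mean): no single factor can veto."""
    return (own + agency + salience) / 3.0


def slow_update(weight, error, gate, slow_lr=0.01):
    """One slow step: move the weight against the error, scaled by the gate.

    slow_lr is a hypothetical small learning rate standing in for the slow channel.
    """
    return weight - slow_lr * gate * error


w0 = 0.0
error = -1.0  # hypothetical error signal; the sign convention is an assumption

# Self-caused event: all three signals are high, so slow credit flows.
w_self = slow_update(w0, error, multiplicative_gate(1.0, 1.0, 0.8))

# Exogenous event: agency is zero.
w_exo_mult = slow_update(w0, error, multiplicative_gate(1.0, 0.0, 0.8))
w_exo_add = slow_update(w0, error, additive_gate(1.0, 0.0, 0.8))

print(w_self, w_exo_mult, w_exo_add)
```

Under these assumptions the multiplicative gate leaves the slow weight unchanged when agency is zero, while additive pooling still moves it. That is one way to read why exogenous interference could overwrite earlier learning under pooling but not under the veto.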