Task-Induced Representational Invariances Depend on Learning Objective in Deep RL
2026-06-01 • Machine Learning
AI summary
The authors studied how two popular reinforcement learning (RL) methods, DQN and PPO, come to represent their environments differently, even when they perform comparably on a navigation task. They found that DQN, a value-based method, learns representations that are invariant to MDP homomorphism symmetries, while PPO, a policy-gradient method, learns representations that are invariant to action symmetries. These differences appear consistently across domains, affect how well each method transfers what it has learned to new situations, and also show up in large language models depending on how they are prompted. The work offers a principled way to compare how RL algorithms represent information and may offer clues about neural coding in animal brains.
Reinforcement Learning, Deep Q-Network (DQN), Proximal Policy Optimization (PPO), Markov Decision Process (MDP), MDP Homomorphism, Action Symmetries, Representation Learning, Transfer Learning, Neural Coding, Large Language Models (LLMs)
Authors
Manu Srinath Halvagal, Sebastian Lee, SueYeon Chung
Abstract
Reinforcement Learning (RL) has long served as a model for goal-directed animal behavior in neuroscience. Modern deep RL has shown remarkable success across many domains, further strengthening this connection. The ability to learn abstract representations of high-dimensional state spaces underlies much of this success. However, theoretical understanding of these learned representations remains limited, hindering direct comparisons between models and animal learning. We address this gap by analyzing deep RL representations through the lens of MDP reduction theory. Investigating canonical RL algorithms in a navigation task, we find that even when performance is comparable, the value-based method (DQN) learns representations that are invariant to MDP homomorphism symmetries, while the policy-gradient method (PPO) learns representations invariant to action symmetries. These differences emerge consistently across domains, have downstream consequences for transfer learning, and appear in LLMs in a prompt-dependent manner. Our findings provide a principled approach to comparing learned representations across RL algorithms, with demonstrated practical implications and possible insights for neural coding in the brain.
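For readers unfamiliar with the term, the sketch below illustrates what an MDP homomorphism is, using the standard tabular definition: a state map f and state-dependent action maps g_s that preserve rewards and aggregated transition probabilities. It is not taken from the paper. The five-state corridor, the function name `is_mdp_homomorphism`, and all variable names are assumptions chosen purely for illustration.

```python
import numpy as np

def is_mdp_homomorphism(P, R, P_abs, R_abs, f, g, tol=1e-9):
    """Check the homomorphism conditions for tabular MDPs.

    P: (S, A, S) transition probabilities, R: (S, A) rewards.
    P_abs, R_abs: the same for the abstract (reduced) MDP.
    f: (S,) abstract state for each state.
    g: (S, A) abstract action for each state-action pair.
    """
    n_states, n_actions, _ = P.shape
    n_abs = P_abs.shape[0]
    # membership[s, z] = 1 if state s maps to abstract state z
    membership = np.zeros((n_states, n_abs))
    membership[np.arange(n_states), f] = 1.0
    # lumped[s, a, z] = total probability of landing in block z
    lumped = P @ membership
    for s in range(n_states):
        for a in range(n_actions):
            z, b = f[s], g[s, a]
            if abs(R[s, a] - R_abs[z, b]) > tol:
                return False
            if not np.allclose(lumped[s, a], P_abs[z, b], atol=tol):
                return False
    return True

# Hypothetical toy task: corridor of 5 cells, absorbing goal in the middle.
LEFT, RIGHT = 0, 1
TOWARD, AWAY = 0, 1
GOAL = 2

P = np.zeros((5, 2, 5))
R = np.zeros((5, 2))
for s in range(5):
    for a in (LEFT, RIGHT):
        if s == GOAL:
            s_next = s  # absorbing, no further reward
        else:
            s_next = min(max(s + (1 if a == RIGHT else -1), 0), 4)
            R[s, a] = 1.0 if s_next == GOAL else 0.0
        P[s, a, s_next] = 1.0

# Abstract MDP: state = distance to goal, actions = toward / away.
P_abs = np.zeros((3, 2, 3))
R_abs = np.zeros((3, 2))
for d in range(3):
    for b in (TOWARD, AWAY):
        if d == 0:
            d_next = 0
        else:
            d_next = d - 1 if b == TOWARD else min(d + 1, 2)
            R_abs[d, b] = 1.0 if d_next == 0 else 0.0
        P_abs[d, b, d_next] = 1.0

f = np.array([2, 1, 0, 1, 2])  # mirror-image cells share a distance
g = np.array([                 # rows: state, columns: LEFT, RIGHT
    [AWAY, TOWARD],
    [AWAY, TOWARD],
    [TOWARD, AWAY],            # arbitrary at the absorbing goal
    [TOWARD, AWAY],
    [TOWARD, AWAY],
])
g_naive = np.tile([TOWARD, AWAY], (5, 1))  # ignores which side of the goal

print(is_mdp_homomorphism(P, R, P_abs, R_abs, f, g))
print(is_mdp_homomorphism(P, R, P_abs, R_abs, f, g_naive))
```

In this toy case the first check succeeds because cells on opposite sides of the goal behave identically once actions are relabeled as "toward" or "away". The second fails because a fixed relabeling of left and right does not respect the mirror symmetry. A representation invariant to this homomorphism would assign the same code to cells 0 and 4, and to cells 1 and 3.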