Stratifying Reinforcement Learning with Signal Temporal Logic

2026-04-06 • Machine Learning

Machine Learning, Logic in Computer Science

AI summary

The authors present a new way to understand Signal Temporal Logic (STL) by linking it to stratification theory: each atomic predicate is read as a membership test in a stratified space, so that most STL formulas induce a division of space-time into layers (strata). They show that this view can be used to analyze how deep reinforcement learning (DRL) models organize their learned representations and how those representations relate to the geometry of the decision space. To ground the theory, they illustrate stratification in Minigrid games and apply numerical techniques to the latent embeddings of a DRL agent trained with the robustness of STL formulas as its reward. They also propose computationally efficient signatures that, on preliminary evidence, appear promising for uncovering the stratification structure of such embedding spaces.

Signal Temporal Logic, stratification theory, space-time stratification, deep reinforcement learning, embedding space, Minigrid games, robustness, latent embeddings, high-dimensional analysis, computational signatures

Authors

Justin Curry, Alberto Speranzon

Abstract

In this paper, we develop a stratification-based semantics for Signal Temporal Logic (STL) in which each atomic predicate is interpreted as a membership test in a stratified space. This perspective reveals a novel correspondence principle between stratification theory and STL, showing that most STL formulas can be viewed as inducing a stratification of space-time. The significance of this interpretation is twofold. First, it offers a fresh theoretical framework for analyzing the structure of the embedding space generated by deep reinforcement learning (DRL) and relates it to the geometry of the ambient decision space. Second, it provides a principled framework that both enables the reuse of existing high-dimensional analysis tools and motivates the creation of novel computational techniques. To ground the theory, we (1) illustrate the role of stratification theory in Minigrid games and (2) apply numerical techniques to the latent embeddings of a DRL agent playing such a game where the robustness of STL formulas is used as the reward. In the process, we propose computationally efficient signatures that, based on preliminary evidence, appear promising for uncovering the stratification structure of such embedding spaces.
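
The abstract mentions using the robustness of STL formulas as a reward without saying how robustness is computed. As a rough illustration only, the sketch below uses the standard quantitative semantics on a discrete-time, one-dimensional signal: an atomic predicate x >= c has robustness x - c, "always" takes a minimum over the time window, and "eventually" takes a maximum. A positive value means the formula is satisfied, a negative value means it is violated. The function names, the discrete-time setting, and the example signal are assumptions made for this sketch; they are not taken from the paper and do not represent the authors' implementation.

```python
from typing import Sequence


def rho_predicate(signal: Sequence[float], t: int, c: float) -> float:
    """Robustness of the atomic predicate x(t) >= c at time step t."""
    return signal[t] - c


def rho_always(signal: Sequence[float], a: int, b: int, c: float) -> float:
    """Robustness of G_[a,b] (x >= c), evaluated at time 0.

    'Always' is the worst case over the window, hence the minimum.
    """
    return min(rho_predicate(signal, t, c) for t in range(a, b + 1))


def rho_eventually(signal: Sequence[float], a: int, b: int, c: float) -> float:
    """Robustness of F_[a,b] (x >= c), evaluated at time 0.

    'Eventually' is the best case over the window, hence the maximum.
    """
    return max(rho_predicate(signal, t, c) for t in range(a, b + 1))


# Hypothetical signal, e.g. one scalar feature of an agent's trajectory.
signal = [0.0, 1.0, 3.0, 2.0]

always_all = rho_always(signal, 0, 3, 1.5)          # violated at t = 0
eventually_all = rho_eventually(signal, 0, 3, 1.5)  # satisfied at t = 2
always_tail = rho_always(signal, 2, 3, 1.5)         # satisfied on [2, 3]

print(always_all, eventually_all, always_tail)
```

In a reinforcement learning setting, a value such as `eventually_all` could serve as an episode-level reward, with larger values indicating that the trajectory satisfies the formula by a wider margin.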
