A Generative Model for Closed-Loop Microsimulation of Signalized Intersections

2026-06-22 | Robotics

Robotics, Artificial Intelligence
AI summary

The authors developed Enactor, a model that simulates how vehicles move and interact at signalized intersections, modeling each vehicle individually. Hand-crafted simulator models miss many vehicle-to-vehicle interactions, and earlier learned predictors look only a short time ahead and tend to become unstable when run on their own outputs. Enactor instead uses a transformer to predict each vehicle's next-step motion and is trained in closed loop, so it sees its own predictions during training. In simulation, Enactor closely matches the speed and travel-time distributions of the SUMO data generator and produces far fewer red-light violations than a recent transformer baseline. On real-world camera data, it outperforms a constant-velocity baseline at every prediction horizon tested. The authors also found that one input feature, the leader rear-bumper feature, has the largest effect on safe vehicle behavior at intersections.

Traffic microsimulation, Trajectory prediction, Transformer model, Intersection dynamics, Closed-loop training, Actor-centric modeling, Red-light violation, Signalized intersection, KL divergence, Multi-horizon prediction
Authors
Yash Ranjan, Rahul Sengupta, Anand Rangarajan, Sanjay Ranka
Abstract
Traffic microsimulators rely on hand-crafted behavior models that reproduce aggregate flow but miss the heterogeneous interactions between vehicles at signalized intersections. Learned trajectory predictors capture richer interactions but are short-horizon and tend to be unstable when run in closed loop. We present Enactor, an actor-centric generative model for closed-loop intersection microsimulation. The model focuses on vehicles; pedestrians are included as context that can influence vehicle decisions but are not themselves predicted. Dynamic actors and lane polylines are encoded in polar coordinates referenced to the intersection center. A transformer with separate spatial and temporal attention blocks predicts a distribution over each actor's next-step motion $(s, \alpha)$. Training uses a closed-loop curriculum so that the model is exposed to its own predictions. We evaluate Enactor in two regimes. First, in a 4000-second simulation-in-the-loop test at two intersection geometries, Enactor controls every dynamic vehicle against a continuously refreshing actor set rather than the fixed cohort on which learned trajectory predictors are usually evaluated. It recovers the SUMO data generator's speed and travel-time distributions, with KL divergence more than an order of magnitude lower than a recent transformer baseline on travel time and substantially lower on speed (roughly $5\times$ lower at Site 1), and it reduces red-light violations relative to the same baseline by more than an order of magnitude. An ablation isolates the leader rear-bumper feature as the change with the largest effect on intersection-aware safety metrics. Second, we apply the same architecture to real-world field data, namely naturalistic vehicle trajectories recorded by a fish-eye camera at a signalized intersection, and evaluate it on multi-horizon prediction tasks, where Enactor outperforms a constant-velocity baseline at every horizon evaluated.
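Two ingredients named in the abstract, the polar encoding about the intersection center and the KL divergence between distributions, can be illustrated with a minimal sketch. The abstract does not specify how either is implemented, so everything here is an assumption made for illustration: the function names `to_polar` and `histogram_kl`, the histogram-based estimator, the bin count, and the smoothing constant `eps` are hypothetical and are not taken from Enactor.

```python
import math


def to_polar(points, center):
    """Convert (x, y) points to (radius, angle) about an assumed intersection center.

    Angles are in radians in [-pi, pi], measured from the +x axis.
    """
    cx, cy = center
    polar = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        polar.append((math.hypot(dx, dy), math.atan2(dy, dx)))
    return polar


def histogram_kl(p_samples, q_samples, bins=20, eps=1e-9):
    """Estimate KL(P || Q) from two sample sets using shared, equal-width bins.

    `eps` is added to every bin so that empty bins do not produce log(0).
    This is one simple estimator among many, chosen here for clarity.
    """
    lo = min(min(p_samples), min(q_samples))
    hi = max(max(p_samples), max(q_samples))
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all samples are equal

    def normalized_histogram(samples):
        counts = [eps] * bins
        for value in samples:
            # Clamp so the maximum value falls in the last bin.
            index = min(int((value - lo) / width), bins - 1)
            counts[index] += 1.0
        total = sum(counts)
        return [c / total for c in counts]

    p = normalized_histogram(p_samples)
    q = normalized_histogram(q_samples)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

Under these assumptions, `to_polar(positions, center)` would be applied to actor positions and lane-polyline vertices before they are passed to a model, and `histogram_kl(reference_speeds, simulated_speeds)` would compare a reference speed distribution with a simulated one. Identical sample sets give a divergence of zero, and larger values indicate a worse match.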