SidewalkBench: Benchmarking Visual Navigation on Urban Sidewalks
2026-06-15 • Robotics
AI summary
The authors introduce SidewalkBench, a benchmark for evaluating how well visual navigation models handle urban sidewalks, which are difficult because of complex layouts, moving pedestrians, and long distances. The benchmark is built on NVIDIA Isaac Sim, which provides GPU-accelerated simulation of realistic sidewalk environments, both procedurally generated and scanned from the real world, with varied pedestrian behaviors. They evaluated nine navigation models on unit-test, pedestrian-reactive, and long-horizon scenarios and found that pedestrian interaction and long-horizon robustness remain major bottlenecks. They also suggest that scaling up sidewalk training with synthetic data is a promising way to improve these models.
visual navigation, urban sidewalks, simulation, NVIDIA Isaac Sim, pedestrian behavior, benchmark, robot navigation, synthetic data, long-horizon tasks, procedural generation
Authors
Zhizheng Liu, Honglin He, Vivek Alumootil, Akshat Pandya, Brad Squicciarini, Wayne Wu, Bolei Zhou
Abstract
Urban sidewalk navigation presents significant challenges due to complex structural layouts, dynamic pedestrian behaviors, and long distances. While recent visual navigation models offer a promising solution, the lack of a unified benchmark hinders quantitative and reproducible evaluation. To bridge this gap, we propose SidewalkBench, a comprehensive benchmark designed for visual navigation on urban sidewalks. Built upon NVIDIA Isaac Sim, SidewalkBench provides GPU-accelerated simulation of diverse, high-fidelity sidewalk environments, including both procedurally generated and real-world scanned scenes. We further populate the scenes with rich, reactive, event-based pedestrian behaviors and flexible, efficient animation, enabling standardized model evaluation under realistic settings. We conduct a comprehensive evaluation of 9 visual navigation models on 330 unit-test scenarios, 800 pedestrian-reactive scenarios, and 105 long-horizon scenarios. Our findings highlight that pedestrian interaction and long-horizon robustness remain critical bottlenecks for existing models, and that scaling up sidewalk training with synthetic data emerges as a promising solution.