CityTrajBench: A Unified Benchmark for City-Scale Vehicle Trajectory Generation

2026-06-01, Machine Learning

Machine Learning, Artificial Intelligence
AI summary

The authors created CityTrajBench, a framework to fairly compare different methods that generate urban vehicle trajectories, which are important for transportation and city planning. They standardized every step from data preparation to evaluation, allowing various models—like those based on statistical methods, VAEs, GANs, diffusion, and flow matching—to be tested on real city data. Their tests showed that each type of model excels in different areas, so no single model is best at everything. This work helps researchers better understand and improve urban trajectory generation by providing a clear and consistent testing environment.

urban trajectory generation, transportation simulation, variational autoencoder (VAE), generative adversarial network (GAN), diffusion models, flow matching, benchmarking, trajectory evaluation, Markov models, mobility analytics
Authors
Shibo Zhu, Xiaodan Shi, Dayin Chen, Yuntian Chen, Haoran Zhang, Tianhao Wu, Jinyue Yan
Abstract
Urban trajectory generation is a fundamental task for transportation simulation, urban planning, and mobility analytics. However, systematic comparison across trajectory generation methods remains difficult because existing studies often rely on different datasets, preprocessing pipelines, trajectory representations, and evaluation metrics. This fragmentation makes it unclear whether reported performance differences arise from the generation mechanism itself or from inconsistent experimental protocols. To address this issue, we present CityTrajBench, a unified benchmark framework and protocol for city-scale vehicle trajectory generation. CityTrajBench standardizes data ingestion, trajectory normalization, feature construction, model adaptation, map-aware post-processing, model selection, and multi-level evaluation under a common setting. It supports heterogeneous generators, including statistical baselines, VAE-based, GAN-based, diffusion-based, and flow-matching-based models, and evaluates them on three real-world urban trajectory datasets. The benchmark measures global spatial realism, trip-level distribution fidelity, trajectory-level geometric similarity, conditional mobility consistency, and efficiency. Experiments reveal clear trade-offs across model families: DiffTraj is strongest on trajectory-level geometric fidelity, DiffRNTraj is competitive on structure-sensitive global realism, and TrajFlow provides a strong balance across realism, quality, conditional consistency, and efficiency. Meanwhile, a simple Markov baseline remains competitive on coarse-grained trip and local-movement statistics. These findings show that urban trajectory generation quality is inherently multi-objective, that no single model dominates across all criteria, and that CityTrajBench provides a reproducible benchmark protocol and testbed for future research on urban mobility generation.
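To make "trip-level distribution fidelity" concrete, the sketch below compares the trip-length distributions of real and generated trajectories with the Jensen-Shannon divergence. This is an illustration only: the abstract does not say which statistics or divergences CityTrajBench computes, so the choice of trip length, the bin settings, and every function name here (`trip_length_jsd`, `histogram`, and so on) are assumptions, not the benchmark's API.

```python
import math
from collections import Counter

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def trip_length_km(traj):
    """Total path length of one trajectory, given as a list of (lat, lon)."""
    return sum(haversine_km(a, b) for a, b in zip(traj, traj[1:]))

def histogram(values, bin_width, n_bins):
    """Normalized histogram; values beyond the last bin are clipped into it."""
    if not values:
        raise ValueError("need at least one value")
    counts = Counter(min(int(v // bin_width), n_bins - 1) for v in values)
    return [counts.get(i, 0) / len(values) for i in range(n_bins)]

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2), bounded in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing; y_i > 0 wherever x_i > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def trip_length_jsd(real, generated, bin_width_km=0.5, n_bins=40):
    """Hypothetical trip-level score: 0 means identical length histograms."""
    p = histogram([trip_length_km(t) for t in real], bin_width_km, n_bins)
    q = histogram([trip_length_km(t) for t in generated], bin_width_km, n_bins)
    return js_divergence(p, q)
```

A score like this captures only one coarse trip statistic. A generator can match the length histogram closely while producing geometrically implausible paths, which is consistent with the abstract's observation that a simple Markov baseline stays competitive on coarse-grained statistics while other models lead on geometric fidelity.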