Not All Points Are Equal: Uncertainty-Aware 4D LiDAR Scene Synthesis
2026-06-01 • Computer Vision and Pattern Recognition
Computer Vision and Pattern Recognition • Robotics
AI summary
The authors propose U4D, a method for generating 4D LiDAR scenes that does not treat every part of a scan equally. It first synthesizes the harder, high-uncertainty regions, such as distant surfaces, occluded boundaries, and small objects, and then fills in the easier regions using those structures as priors. Uncertainty is measured per point as the Shannon entropy of a pretrained segmentor's predictions, and a MoST (Mixture of Spatio-Temporal) block keeps the scene consistent across frames. Experiments on nuScenes and SemanticKITTI show better scene fidelity and temporal consistency than previous methods.
LiDAR • 4D scene reconstruction • Shannon Entropy • diffusion models • spatial uncertainty • temporal consistency • segmentor • nuScenes • SemanticKITTI
Authors
Xiang Xu, Alan Liang, Youquan Liu, Xian Sun, Linfeng Li, Lingdong Kong, Ziwei Liu, Qingshan Liu
Abstract
Constructing faithful 4D worlds from LiDAR-acquired sequences is crucial for embodied AI, yet current generative frameworks apply uniform modeling capacity across all spatial regions. This ignores that perceptual difficulty varies dramatically within a single scan: distant surfaces, occluded boundaries, and small-scale objects carry far higher uncertainty than well-observed structures. We present U4D, a new framework that explicitly leverages spatial uncertainty to guide LiDAR scene generation in a "hard-to-easy" schedule. U4D derives per-point uncertainty maps via Shannon Entropy from a pretrained segmentor, then applies an unconditional diffusion stage to synthesize high-entropy areas with precise geometry, followed by a conditional completion stage that fills in the remaining regions using these structures as priors. A MoST (Mixture of Spatio-Temporal) block further maintains cross-frame coherence by dynamically balancing spatial detail and temporal continuity. Extensive experiments on nuScenes and SemanticKITTI demonstrate state-of-the-art scene fidelity, temporal consistency, and downstream performance.
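To make the uncertainty step concrete, the following is a minimal sketch of how a per-point Shannon entropy map could be computed from a segmentor's class probabilities and used to separate "hard" points from "easy" ones. It is an illustration only and not the authors' implementation: the function names, the `hard_ratio` parameter, and the top-fraction selection rule are assumptions, since the abstract does not state how high-entropy regions are chosen.

```python
import math


def shannon_entropy(probs):
    """Entropy (in nats) of one categorical distribution; zero terms are skipped."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def split_by_uncertainty(point_probs, hard_ratio=0.25):
    """Rank points by entropy and split them into hard and easy index sets.

    point_probs: one list of class probabilities per LiDAR point, e.g. the
                 softmax output of a pretrained segmentor (assumed input).
    hard_ratio:  hypothetical fraction of points treated as high-entropy.
    """
    entropies = [shannon_entropy(p) for p in point_probs]
    # Highest-entropy (most uncertain) points come first.
    order = sorted(range(len(entropies)), key=lambda i: entropies[i], reverse=True)
    n_hard = max(1, round(hard_ratio * len(order)))
    hard = sorted(order[:n_hard])   # candidates for the first generation stage
    easy = sorted(order[n_hard:])   # filled in afterwards, conditioned on the hard set
    return entropies, hard, easy


# Toy example: four points, three semantic classes.
probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # near-uniform prediction, high entropy
    [0.70, 0.20, 0.10],
    [0.50, 0.50, 0.00],
]
entropies, hard, easy = split_by_uncertainty(probs, hard_ratio=0.5)
print(hard, easy)
```

In this toy case the near-uniform point and the three-way ambiguous point land in the hard set, mirroring the "hard-to-easy" ordering described above. A fixed entropy threshold would be an equally plausible selection rule.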