R5DGS: Semantic-Aware 4D Gaussian Splatting with Rigid Body Constraints for Efficient Dynamic Scene Reconstruction
2026-05-25 • Computer Vision and Pattern Recognition
AI summary
The authors developed R5DGS, a system that improves how computers understand and predict moving 3D scenes from multiple video angles. They combined physics-based models with special identity tags that link parts of the scene to specific objects, making it easier to find and track these objects over time. Their method uses text-based searches to identify objects and speeds up predictions by focusing physics calculations on object centers rather than every detail. This approach makes future scene predictions faster while keeping the movement believable.
3D scene reconstruction, Gaussian Splatting, 4D Gaussian representation, CLIP model, object identity encoding, rigid-body dynamics, open-vocabulary text prompting, multi-view videos, trajectory prediction, physics-informed modeling
Authors
Denis Gridusov, Maxim Popov, Sergey Kolyubin
Abstract
Reconstructing and predicting dynamic 3D scenes from multi-view videos is a foundational task for robotics, AR/VR, and digital twins. Recent physics-informed Gaussian Splatting methods achieve impressive future-frame extrapolation but lack semantic awareness and incur large computational overhead. We introduce $\textbf{R5DGS}$, a framework that augments a physics-driven 4D Gaussian representation with compact Identity Encoding vectors, enabling precise Gaussian-to-object association. By constructing an offline CLIP-based object lookup table, we support open-vocabulary text prompting to retrieve and render object-specific Gaussians across arbitrary timestamps and viewpoints. Furthermore, we propose a rigid-body inference constraint that predicts and integrates physical dynamics exclusively for object centroids, propagating motion to the associated Gaussians via relative transformations. This optimization yields an 11 FPS speedup during extrapolation without compromising trajectory plausibility.
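The rigid-body constraint described in the abstract can be illustrated with a minimal sketch: dynamics are integrated only for an object's centroid, and each Gaussian assigned to that object is moved by reapplying its fixed offset from the centroid under the object's rotation. This is an illustration under stated assumptions, not the authors' implementation. The function names (`relative_offsets`, `propagate`, `rot_z`), the constant-velocity centroid update, and the hand-picked rotation are all hypothetical stand-ins; the abstract does not specify the dynamics model or how rotations are parameterized.

```python
import numpy as np

def relative_offsets(means, centroid):
    """Offsets of each Gaussian mean from the object centroid at the reference time."""
    return means - centroid

def propagate(offsets, centroid_t, rotation_t):
    """Place Gaussians at time t: rotate the stored offsets, then translate to the new centroid.

    offsets:    (N, 3) offsets from the reference centroid
    centroid_t: (3,)   centroid position at time t
    rotation_t: (3, 3) object rotation at time t relative to the reference pose
    """
    # Row-vector form of R @ offset for every Gaussian at once.
    return centroid_t + offsets @ rotation_t.T

def rot_z(theta):
    """Rotation matrix about the z-axis (used here only as a toy object rotation)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Four Gaussian means that belong to one object (hypothetical data).
means0 = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0], [0.0, -1.0, 0.0]])
centroid0 = means0.mean(axis=0)
offsets = relative_offsets(means0, centroid0)

# Assumption: a constant-velocity step stands in for the predicted centroid dynamics.
velocity = np.array([0.5, 0.0, 0.0])
dt = 2.0
centroid_t = centroid0 + velocity * dt

# Only the centroid was integrated; every Gaussian follows via its relative transform.
means_t = propagate(offsets, centroid_t, rot_z(np.pi / 2))
```

Because the dynamics step touches one point per object instead of one point per Gaussian, its cost scales with the number of objects, which is consistent with the efficiency argument in the abstract. The propagation itself preserves all pairwise distances between Gaussians of the same object, which is what makes the motion rigid.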