Ocean4D: Generative Underwater 4D Reconstruction via Medium-Aware Video Diffusion
2026-06-22 • Computer Vision and Pattern Recognition
AI summary
The authors address the difficulty of reconstructing underwater 3D scenes over time: water absorbs and scatters light, and drifting particles break the near-static assumptions that existing methods rely on. They created a system called Ocean4D that improves underwater 4D reconstruction by combining two parts: one that keeps geometry consistent across time and views, and another that reduces the effects of water absorption and scattering on the images. Their method takes a single monocular video plus target camera poses and generates stable videos from those viewpoints. Experiments show their approach outperforms existing methods on both dynamic and static underwater scenes.
underwater 4D reconstruction, light transport, absorption, backscatter, monocular video, latent diffusion, geometry consistency, medium-aware denoising, cross-view consistency, dynamic scenes
Authors
Yuqiang Huang, Yuxi Wang, Junyu Dong, Zhaoxiang Zhang
Abstract
Underwater 4D reconstruction remains challenging due to the coupling between degraded light transport in participating media and dynamic water variations. Most existing methods are developed under in-air assumptions and do not explicitly account for underwater absorption and backscatter. Additionally, near-static assumptions make these approaches sensitive to drifting particles and dynamic distractors, leading to unstable geometry and inconsistent cross-view results. To address these issues, we propose a generative framework for underwater 4D reconstruction, named Ocean4D, which is built on two complementary components. Specifically, 4D-GCC constructs 4D geometrically consistent conditioning with improved cross-frame coverage, while the Medium-Aware Block performs implicit medium-aware denoising in the latent diffusion process to stabilize underwater appearance under absorption and scattering. Given a monocular video and target cameras, our method generates videos along the target trajectories while preserving global structure and cross-view consistency. Extensive experiments on both dynamic and static underwater benchmarks demonstrate state-of-the-art performance on underwater reconstruction.
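To make the absorption and backscatter terms concrete, the sketch below applies the commonly used simplified underwater image formation model, I = J·exp(-βz) + B∞·(1 - exp(-βz)), to a single RGB value. This is not the Medium-Aware Block, which the abstract describes as implicit and operating in the latent diffusion process; it only illustrates the degradation that an in-air method ignores. The function names and the per-channel coefficients are hypothetical values chosen for illustration, not taken from the paper.

```python
import math


def underwater_channel(j, depth_m, beta, b_inf):
    """Simplified underwater image formation for one color channel.

    j       : clean (in-air) intensity in [0, 1]
    depth_m : camera-to-scene distance in meters
    beta    : attenuation coefficient in 1/m (assumed value)
    b_inf   : veiling light, i.e. backscatter at infinite range (assumed value)
    """
    transmission = math.exp(-beta * depth_m)
    direct = j * transmission                 # absorbed/attenuated scene signal
    backscatter = b_inf * (1.0 - transmission)  # light scattered into the camera
    return direct + backscatter


def degrade_rgb(rgb, depth_m,
                betas=(0.40, 0.10, 0.05),   # hypothetical: red attenuates fastest
                b_inf=(0.05, 0.35, 0.45)):  # hypothetical blue-green veiling light
    """Apply the per-channel model to an (r, g, b) tuple."""
    return tuple(
        underwater_channel(j, depth_m, beta, b)
        for j, beta, b in zip(rgb, betas, b_inf)
    )


if __name__ == "__main__":
    gray = (0.8, 0.8, 0.8)
    for z in (0.0, 2.0, 5.0, 20.0):
        r, g, b = degrade_rgb(gray, z)
        print(f"z={z:5.1f} m  r={r:.3f}  g={g:.3f}  b={b:.3f}")
```

With these assumed coefficients, a neutral gray surface shifts toward blue-green as range increases and converges to the veiling light at large distances, which is the kind of range-dependent appearance change that destabilizes cross-view consistency.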