Fast and Lightweight Novel View Synthesis with Differentiable Multiplane Image
2026-06-01 • Computer Vision and Pattern Recognition
Computer Vision and Pattern Recognition, Artificial Intelligence
AI summary
The authors study novel view synthesis: creating new views of a scene from a few pictures. They point out that popular methods like Neural Radiance Fields and 3D Gaussian Splatting can be slow and large, and often fail when only a few views are available. To fix this, they use a simpler representation called Multiplane Image (MPI) layers, initializing the scene from predicted point maps and then improving it with a one-step diffusion process that fills holes and reduces artifacts. Their method is 30.7% faster than a representative Gaussian Splatting-based method and uses only 14.8% of its model size, while still producing competitive image quality in front-view scenarios.
Novel View Synthesis, Neural Radiance Fields (NeRF), 3D Gaussian Splatting (3DGS), Multiplane Image (MPI), Visual Foundation Models, Differentiable Optimization, One-step Diffusion, Sparse-view Conditions
Authors
Kaidi Zhang, Guanxu Zhu
Abstract
Recently, novel view synthesis has witnessed remarkable progress, with mainstream methods such as Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) delivering impressive results. However, these approaches often struggle to balance rendering speed and model size, and their optimization-based training can be highly time-consuming. Furthermore, they typically rely on dense observations and often fail to produce satisfactory results under sparse-view conditions. Although feed-forward reconstruction significantly reduces the optimization time of 3DGS, its pixel-aligned formulation generates millions of Gaussians from a single image, severely limiting its practical deployment on mobile devices. To address these limitations, we revisit the Multiplane Image (MPI) representation, which represents scenes using a compact set of planar layers for efficient novel view synthesis. Leveraging recent advances in visual foundation models, we utilize predicted point maps for reliable geometric initialization, followed by differentiable optimization. To address the holes and artifacts in sparsely initialized MPI, we introduce one-step diffusion, which participates in both the differentiable optimization of the MPI and the postprocessing of rendering results. Compared with a representative GS-based method, our approach is 30.7% faster and uses only 14.8% of its model size, while achieving competitive synthesis quality on front-view scenarios.
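To illustrate why a stack of planar layers is compact and cheap to render, the following is a minimal sketch of the standard back-to-front "over" compositing that MPI representations generally rely on. It is not the authors' implementation: the abstract does not describe their renderer, and the names `Layer` and `composite_mpi` are hypothetical. The sketch also assumes the layers have already been warped into the target view, so the per-plane homography step is omitted.

```python
from typing import List, Tuple

RGB = Tuple[float, float, float]
# Hypothetical layer type: per-pixel colours and per-pixel alphas
# on one fronto-parallel plane, both indexed as [row][column].
Layer = Tuple[List[List[RGB]], List[List[float]]]


def composite_mpi(layers: List[Layer]) -> List[List[RGB]]:
    """Composite MPI layers ordered from farthest to nearest.

    Assumes every layer is already warped into the target camera,
    so only the alpha "over" operator is applied here.
    """
    first_colors, _ = layers[0]
    height, width = len(first_colors), len(first_colors[0])
    # Start from a black background behind the farthest plane.
    out = [[(0.0, 0.0, 0.0) for _ in range(width)] for _ in range(height)]
    for colors, alphas in layers:
        for y in range(height):
            for x in range(width):
                a = alphas[y][x]
                # Over operator: the nearer layer covers what is behind it
                # in proportion to its alpha.
                out[y][x] = tuple(
                    a * c + (1.0 - a) * o
                    for c, o in zip(colors[y][x], out[y][x])
                )
    return out
```

For example, a single-pixel scene with an opaque red back plane and a half-transparent blue front plane composites to an even mix of red and blue. Because each output pixel is a weighted sum of layer values, the operation is differentiable with respect to the colours and alphas, which is what makes gradient-based optimization of the layers possible in principle.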