SpatialEdit: Benchmarking Fine-Grained Image Spatial Editing
2026-04-06 • Computer Vision and Pattern Recognition
AI summary
The authors created a new test called SpatialEdit-Bench to better check how well models can change the position and view of objects in images. They made a large, computer-generated dataset named SpatialEdit-500k to help train models with precise control over object placement and camera angles. Using this dataset, they developed SpatialEdit-16B, a model that does better than previous ones at detailed spatial editing while still performing well on general editing tasks. All their tools and data will be shared publicly for others to use.
image spatial editing, benchmark, geometry-driven transformation, dataset, Blender, camera viewpoint, object layout, perceptual plausibility, geometric fidelity, spatial manipulation
Authors
Yicheng Xiao, Wenhu Zhang, Lin Song, Yukang Chen, Wenbo Li, Nan Jiang, Tianhe Ren, Haokun Lin, Wei Huang, Haoyang Huang, Xiu Li, Nan Duan, Xiaojuan Qi
Abstract
Image spatial editing performs geometry-driven transformations, allowing precise control over object layout and camera viewpoints. Current models fall short on fine-grained spatial manipulations, which motivates a dedicated assessment suite. Our contributions are threefold: (i) We introduce SpatialEdit-Bench, a complete benchmark that evaluates spatial editing by jointly measuring perceptual plausibility and geometric fidelity via viewpoint reconstruction and framing analysis. (ii) To address the data bottleneck for scalable training, we construct SpatialEdit-500k, a synthetic dataset generated with a controllable Blender pipeline that renders objects across diverse backgrounds and systematic camera trajectories, providing precise ground-truth transformations for both object- and camera-centric operations. (iii) Building on this data, we develop SpatialEdit-16B, a baseline model for fine-grained spatial editing. Our method achieves competitive performance on general editing while substantially outperforming prior methods on spatial manipulation tasks. All resources will be made public at https://github.com/EasonXiao-888/SpatialEdit.
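To make "systematic camera trajectories" and "geometric fidelity" concrete, the following is a minimal illustrative sketch, not the authors' pipeline or metric. It assumes a simple circular orbit around an object at the origin and a wrapped angular error between a target and a recovered viewpoint angle. The function names `orbit_camera_positions` and `angular_error_deg` are hypothetical, and the abstract does not specify the actual trajectories or scoring used in SpatialEdit-500k or SpatialEdit-Bench.

```python
import math


def orbit_camera_positions(radius, elevation_deg, num_views):
    """Return evenly spaced camera positions on a circle around the origin.

    Assumption for illustration only: z is the up axis and the object
    sits at the origin. The real dataset may use different trajectories.
    """
    elev = math.radians(elevation_deg)
    positions = []
    for i in range(num_views):
        # Step the azimuth uniformly over a full turn.
        azim = 2.0 * math.pi * i / num_views
        x = radius * math.cos(elev) * math.cos(azim)
        y = radius * math.cos(elev) * math.sin(azim)
        z = radius * math.sin(elev)
        positions.append((x, y, z))
    return positions


def angular_error_deg(target_deg, predicted_deg):
    """Smallest absolute difference between two angles, in [0, 180] degrees.

    Hypothetical stand-in for a geometric fidelity score: it compares the
    requested viewpoint angle with one recovered from the edited image.
    """
    diff = (predicted_deg - target_deg) % 360.0
    return min(diff, 360.0 - diff)
```

For example, `orbit_camera_positions(2.0, 30.0, 8)` yields eight positions that all lie at distance 2.0 from the origin, and `angular_error_deg(350.0, 10.0)` evaluates to `20.0` because the difference wraps around 360 degrees. Because a renderer knows each camera pose exactly, ground-truth transformations between any two views come for free, which is the property the abstract attributes to the synthetic pipeline.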