3DReflecNet: A Large-Scale Dataset for 3D Reconstruction of Reflective, Transparent, and Low-Texture Objects

2026-05-11

Computer Vision and Pattern Recognition
AI summary

The authors created a large new dataset called 3DReflecNet to help computers better understand and recreate 3D objects that are shiny, see-through, or have very plain surfaces. These kinds of objects are usually hard for current 3D reconstruction methods because they violate the assumptions those methods rely on. The dataset includes both computer-generated and real-world objects, with many different shapes and lighting scenarios. The authors tested existing methods on this dataset and found that they often do not work well, showing that improved techniques are needed.

3D reconstruction, reflective surfaces, transparent materials, multi-view reconstruction, photometric consistency, physically-based rendering, structure-from-motion, novel view synthesis, reflection removal, relighting
Authors
Zhicheng Liang, Haoyi Yu, Boyan Li, Dayou Zhang, Zijian Cao, Tianyi Gong, Junhua Liu, Shuguang Cui, Fangxin Wang
Abstract
Accurate 3D reconstruction of objects with reflective, transparent, or low-texture surfaces remains notoriously challenging. Such materials often violate key assumptions in multi-view reconstruction pipelines, such as photometric consistency and the availability of distinct geometric texture cues. Existing datasets primarily focus on diffuse, textured objects and therefore provide limited insight into performance under real-world material complexities. We introduce 3DReflecNet, a large-scale hybrid dataset exceeding 22 TB that is specifically designed to benchmark and advance 3D vision methods for these challenging materials. 3DReflecNet combines two types of data: over 120,000 synthetic instances generated via physically-based rendering of more than 12,000 shapes, and over 1,000 real-world objects captured with consumer devices. Together, these data comprise more than 7 million multi-view frames. The dataset spans diverse materials, complex lighting conditions, and a wide range of geometric forms, including shapes generated from both real and LLM-synthesized 2D images using diffusion-based pipelines. To support robust evaluation, we design benchmarks for five core tasks: image matching, structure-from-motion, novel view synthesis, reflection removal, and relighting. Extensive experiments demonstrate that state-of-the-art methods struggle to maintain accuracy across these settings, highlighting the need for more resilient 3D vision models.
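To make the photometric consistency assumption mentioned in the abstract concrete, the following is a minimal illustrative sketch, not code from the paper or the dataset release: it assumes hypothetical pinhole cameras (intrinsics K, poses (R, t)) and image arrays, and shows the intensity residual that multi-view pipelines expect to be near zero for a correctly reconstructed 3D point. Reflective and transparent surfaces break this expectation because their observed radiance varies with viewing direction.

```python
# Illustrative sketch of the photometric (Lambertian) consistency assumption
# used by multi-view reconstruction pipelines. All inputs here are
# hypothetical stand-ins; this is not part of the 3DReflecNet toolchain.

import numpy as np

def project(K, R, t, X):
    """Project a 3D point X into a pinhole camera with intrinsics K and pose (R, t)."""
    x = K @ (R @ X + t)          # camera-frame point mapped to homogeneous pixel coords
    return x[:2] / x[2]          # perspective divide -> (u, v) pixel coordinates

def photometric_residual(img_i, img_j, K, pose_i, pose_j, X):
    """Intensity difference of the same 3D point observed in two views.

    For a diffuse (Lambertian) surface this residual is near zero at the
    true surface point; for a reflective or transparent surface the
    observed radiance depends on the viewing direction, so the residual
    can stay large even when the geometry is correct.
    """
    ui = np.round(project(K, *pose_i, X)).astype(int)
    uj = np.round(project(K, *pose_j, X)).astype(int)
    # Index images as [row, col] = [v, u].
    return float(img_i[ui[1], ui[0]]) - float(img_j[uj[1], uj[0]])
```

Reconstruction methods that minimize such residuals (or match features under the same assumption) are exactly the ones stressed by the reflective, transparent, and low-texture materials the benchmark targets.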