WildDepth: A Multimodal Dataset for 3D Wildlife Perception and Depth Estimation

2026-03-17 • Computer Vision and Pattern Recognition

Computer Vision and Pattern RecognitionDigital Libraries

AI summaryⓘ

The authors created WildDepth, a new dataset that combines images and LiDAR data to better estimate depth and 3D shapes of animals, both domestic and wild. Previous models mostly used images without scale information, which made it hard to check accuracy. Their experiments show that using both types of data improves how well machines can guess distances and build 3D models of animals. By sharing WildDepth, the authors hope to help develop better systems that understand animals in different environments.

depth estimation3D reconstructionLiDARRGB imagesmultimodal datasetmetric scalebehavior detectioncomputer visionChamfer distanceroot mean square error (RMSE)

Authors

Muhammad Aamir, Naoya Muramatsu, Sangyun Shin, Matthew Wijers, Jiaxing Jhong, Xinyu Hou, Amir Patel, Andrew Markham

Abstract

Depth estimation and 3D reconstruction have been extensively studied as core topics in computer vision. Starting from rigid objects with relatively simple geometric shapes, such as vehicles, the research has expanded to address general objects, including challenging deformable objects, such as humans and animals. However, for the animal, in particular, the majority of existing models are trained based on datasets without metric scale, which can help validate image-only models. To address this limitation, we present WildDepth, a multimodal dataset and benchmark suite for depth estimation, behavior detection, and 3D reconstruction from diverse categories of animals ranging from domestic to wild environments with synchronized RGB and LiDAR. Experimental results show that the use of multi-modal data improves depth reliability by up to 10% RMSE, while RGB-LiDAR fusion enhances 3D reconstruction fidelity by 12% in Chamfer distance. By releasing WildDepth and its benchmarks, we aim to foster robust multimodal perception systems that generalize across domains.

View PDFOpen arXiv