ATN3D: Density-Aware LiDAR-Radar Early 3D Object Detection Under Extreme Sparsity

2026-06-08, Computer Vision and Pattern Recognition

Computer Vision and Pattern Recognition, Artificial Intelligence
AI summary

The authors focus on improving 3D object detection for self-driving cars, especially for detecting things far away where sensor data is very sparse. They identify two main problems: mixing sensor data too early can cause mistakes, and training usually ignores small or distant objects. To fix this, they propose ATN3D, a system that smartly combines LiDAR and Radar data by paying attention to how dense the data points are and the surrounding trustworthy areas. Their method also adjusts to weather and distance during training to detect far objects better. Tests show that ATN3D detects distant objects more accurately, even in foggy conditions.

3D object detection, LiDAR, Radar, multimodal fusion, sparse sensing, long-range detection, self-attention, occupancy grid, range-aware loss, autonomous vehicles
Authors
Debojyoti Biswas, Xianbiao Hu
Abstract
3D object detection is the backbone of perception for automated vehicles (AV) and broader intelligent transportation systems applications. Long-range detection is challenging because sensing evidence is sparse, yet this "long-range" scenario is routine in traffic. Although distances beyond 30 m are often labeled long-range in computer vision, on roadways such a distance affords only approximately 1-2 s for perception and decision-making. Under such extreme sparsity, two core challenges arise. First, early multimodal fusion tends to discard sparsity information and inject noise from empty or falsely occupied cells, degrading long-range recall. Second, context-agnostic uniform channel supervision favors dense and near-range samples, leaving far and small objects under-optimized and delaying the earliest detection of distant objects. We propose "Ask The Neighbor" (ATN3D), a LiDAR-Radar framework tailored for sparse-range conditions. ATN3D introduces (i) density-aware early fusion with cross-modal gating that conditions fusion on per-voxel density/sparsity and Radar evidence, (ii) occupancy-gated neighborhood aggregation with circular kernels to aggregate only from credible cells, (iii) evidence-conditioned channel self-attention to adapt channel weights to weather and range, and (iv) a range-aware loss that re-balances classification and localization by distance, aligning training with distance-stratified evaluation. On the VoD benchmark across clear and foggy conditions, ATN3D surpasses strong baselines: +3.55% mAP in clear weather and +8.41% mAP under simulated heavy fog; for >30 m objects, gains are +3.33% (clear) and +2.09% (heavy fog). These results indicate earlier and more reliable long-range detections under sparse sensing in on-road traffic.
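The abstract does not give the form of the range-aware loss, so the following is only an illustrative sketch of the general idea of re-weighting per-object losses by distance. The function names (`range_weight`, `range_aware_loss`), the exponential weighting curve, and every parameter value (`near_m`, `max_boost`, `scale_m`, `loc_weight`) are assumptions made for this example and are not taken from ATN3D.

```python
import math

def range_weight(distance_m, near_m=30.0, max_boost=2.0, scale_m=20.0):
    """Hypothetical per-object weight (not the ATN3D formula).

    Returns 1.0 for objects closer than near_m and rises smoothly
    toward max_boost as the object moves farther away.
    """
    excess = max(0.0, distance_m - near_m)
    return 1.0 + (max_boost - 1.0) * (1.0 - math.exp(-excess / scale_m))

def range_aware_loss(cls_losses, loc_losses, distances_m, loc_weight=1.0):
    """Weighted mean of per-object classification + localization losses.

    Far objects receive larger weights, so their errors contribute more
    to the total than under a uniform average. The weighting scheme is
    an assumption made for illustration.
    """
    weights = [range_weight(d) for d in distances_m]
    total = sum(
        w * (c + loc_weight * l)
        for w, c, l in zip(weights, cls_losses, loc_losses)
    )
    return total / sum(weights)

# Two objects with identical per-object losses in both calls;
# only the distance of the second object changes.
uniform = range_aware_loss([1.0, 3.0], [0.0, 0.0], [5.0, 10.0])
far_heavy = range_aware_loss([1.0, 3.0], [0.0, 0.0], [5.0, 60.0])
print(uniform, far_heavy)
```

With both objects inside 30 m the result is the plain mean of the per-object losses. Moving the higher-loss object to 60 m raises its weight, so the same errors produce a larger total and the optimizer is pushed harder toward the distant object.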