Analytical Exploration of Spatial Audio Cues: A Differentiable Multi-Sphere Scattering Model

2026-03-02

Sound
AI summary

The authors address the difficulty of building systems that can tell where sounds come from underwater. This is hard because water and soft tissue have nearly the same acoustic impedance, so sound largely passes through soft tissue rather than scattering off it as it would off a rigid head in air. They derive a closed-form mathematical model that predicts how sound scatters from a semi-transparent sphere containing two rigid spherical scatterers, mimicking the acoustic anatomy of underwater animals. The model is fully differentiable, so it can be combined with machine learning to improve sound source localization under noise. They also show that it tracks moving sources accurately using an Extended Kalman Filter. The approach could help build better microphone arrays, on land or underwater, that rely on scattering-based spatial cues rather than conventional beamforming.

spatial hearing, sound scattering, Head-Related Transfer Function, interaural time difference, interaural level difference, differentiable modeling, frequency weighting, Extended Kalman Filter, underwater acoustics, machine learning
Authors
Siminfar Samakoush Galougah, Pranav Pulijala, Ramani Duraiswami
Abstract
A primary challenge in developing synthetic spatial hearing systems, particularly underwater, is accurately modeling sound scattering. Biological organisms achieve 3D spatial hearing by exploiting sound scattering off their bodies to generate location-dependent interaural time and level differences (ITD/ILD). While Head-Related Transfer Function (HRTF) models based on rigid scattering suffice for terrestrial humans, they fail in underwater environments because the acoustic impedances of water and soft tissue are nearly matched. Motivated by the acoustic anatomy of underwater animals, we introduce a novel, analytically derived, closed-form forward model for scattering from a semi-transparent sphere containing two rigid spherical scatterers. This model accurately maps source direction, frequency, and material properties to the pressure field, capturing the complex physics of layered, penetrable structures. Critically, our model is implemented in a fully differentiable setting, enabling its integration with a machine learning algorithm to optimize a cost function for active localization. We demonstrate enhanced convergence for localization under noise using a physics-informed frequency weighting scheme, and present accurate moving-source tracking via an Extended Kalman Filter (EKF) with analytically computed Jacobians. Our work suggests that differentiable models of scattering from layered rigid and transparent geometries offer a promising new foundation for microphone arrays that leverage scattering-based spatial cues over conventional beamforming, in both terrestrial and underwater settings. Our model will be made open source.
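
To make the idea of gradient-based localization on a frequency-weighted cost concrete, the following is a minimal sketch and not the authors' model. It replaces the multi-sphere scattering model with a toy free-field phase difference between two receivers, and it minimizes a weighted phase-mismatch cost by gradient descent using a hand-derived derivative. The receiver spacing, sound speed, frequency set, uniform weights, and all function names are assumptions chosen for illustration.

```python
import math

C_WATER = 1500.0   # nominal sound speed in water, m/s (assumed value)
SPACING = 0.2      # hypothetical receiver separation, m


def phase_difference(theta, freq):
    """Toy free-field inter-receiver phase difference in radians.

    This is a stand-in for a scattering model, not the paper's formula.
    """
    return 2.0 * math.pi * freq * SPACING * math.sin(theta) / C_WATER


def cost_and_grad(theta, freqs, measured, weights):
    """Frequency-weighted phase-mismatch cost and its analytic derivative."""
    cost, grad = 0.0, 0.0
    for f, phi_meas, w in zip(freqs, measured, weights):
        delta = phi_meas - phase_difference(theta, f)
        # Derivative of the modeled phase with respect to azimuth.
        dphi = 2.0 * math.pi * f * SPACING * math.cos(theta) / C_WATER
        # |exp(i*phi_meas) - exp(i*phi_model)|^2 = 2 * (1 - cos(delta))
        cost += w * 2.0 * (1.0 - math.cos(delta))
        grad += -w * 2.0 * math.sin(delta) * dphi
    return cost, grad


def localize(freqs, measured, weights, theta0=0.0, lr=1.0, steps=300):
    """Plain gradient descent on the weighted cost, starting from theta0."""
    theta = theta0
    for _ in range(steps):
        _, g = cost_and_grad(theta, freqs, measured, weights)
        theta -= lr * g
    return theta


# Noise-free synthetic measurement from a source at 0.6 rad azimuth.
freqs = [200.0, 400.0, 600.0, 800.0, 1000.0]
true_theta = 0.6
measured = [phase_difference(true_theta, f) for f in freqs]
uniform = [1.0 / len(freqs)] * len(freqs)  # hypothetical weighting choice

estimate = localize(freqs, measured, uniform)
print(f"estimated azimuth: {estimate:.4f} rad")
```

The `weights` argument is where a frequency weighting would enter; the uniform weights used here are only a placeholder and do not reproduce the physics-informed scheme described in the abstract. With an automatic differentiation framework, the hand-written derivative in `cost_and_grad` would be obtained from the forward model directly.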