HapticLDM: A Diffusion Model for Text-to-Vibrotactile Generation

2026-05-11 | Human-Computer Interaction

Human-Computer Interaction, Artificial Intelligence
AI summary

The authors developed HapticLDM, a new method that turns text descriptions into matching vibrations to improve user experiences in areas like games and the metaverse. Unlike previous methods that struggled to capture the full meaning of text due to their step-by-step approach, HapticLDM uses a Latent Diffusion Model to create more accurate and smooth vibrations. They improved the data and processing techniques to better reflect motion and conducted tests showing that their model produces more realistic and well-matched vibrations. Users also found it easier to design haptic effects with this new approach.

Text-to-vibration generation, Haptic feedback, Latent Diffusion Models, Autoregressive models, Temporal envelope, Semantic alignment, Haptic design, Metaverse, Interactive scenarios, User study
Authors
Jiahao Xiong, Fei Wang, Anran Xu, Pinzhi Huang, Tao Wen, Lijia Pan, Cai Chen
Abstract
Text-to-vibration generation converts natural language into haptic feedback, enabling vibration-effect designers to obtain scenario-appropriate vibrations more efficiently. This shows great potential in application fields such as the metaverse, games, and film, where it can enrich the user experience in interactive scenarios. The core challenge in this field is how to generate accurate, consistent, and complete vibrations according to textual semantics. Recent autoregressive (AR) approaches (e.g., HapticGen) exhibit limited capacity to fully capture global dependencies, owing to the inherently sequential nature of their modeling and prevailing data constraints. In this paper, we propose HapticLDM, the first text-to-vibration generative model built upon Latent Diffusion Models (LDMs). First, on the data side, we introduce a text-processing strategy that emphasizes dynamic characteristics to curate high-quality data pairs for fine-grained dynamic modeling. Second, HapticLDM incorporates a global denoising mechanism that regulates coherent and stable variations in the temporal envelope. Furthermore, we conduct extensive evaluations, including A/B testing against the state-of-the-art baseline and a user study involving 30 participants. The results demonstrate that our model enhances realism and semantic alignment. Qualitative feedback further indicates that HapticLDM simplifies the haptic design workflow while generating diverse, subtle, and physically precise vibrations.
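For readers unfamiliar with the term, the temporal envelope of a vibration is the slowly varying amplitude contour that rides on top of the fast carrier oscillation. The abstract does not say how HapticLDM represents or computes it, so the following is only an illustrative sketch of the concept using a generic rectify-and-smooth estimate. The function name `temporal_envelope`, the 20 ms window, and the synthetic 150 Hz test signal are all assumptions made for this example, not details from the paper.

```python
import numpy as np


def temporal_envelope(signal, sample_rate, window_ms=20.0):
    """Estimate an amplitude envelope by rectifying and smoothing.

    Generic signal-processing illustration (hypothetical helper),
    not the method used by HapticLDM.
    """
    # Window length in samples, at least one sample long.
    window = max(1, int(sample_rate * window_ms / 1000.0))
    kernel = np.ones(window) / window
    # Full-wave rectify, then apply a moving average of the same length.
    return np.convolve(np.abs(signal), kernel, mode="same")


# Hypothetical test signal: a 150 Hz carrier whose amplitude decays over 0.5 s.
sample_rate = 8000
t = np.arange(0, 0.5, 1.0 / sample_rate)
vibration = np.exp(-6.0 * t) * np.sin(2 * np.pi * 150.0 * t)

envelope = temporal_envelope(vibration, sample_rate)
```

Here `envelope` has the same length as `vibration` and traces the decaying amplitude rather than the individual oscillations. A generated vibration with abrupt, unintended jumps in this contour is the kind of incoherence that the abstract attributes to limited global modeling.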