DCGrasp: Distance-aware Controllable Grasp Generation
2026-06-29 • Computer Vision and Pattern Recognition
AI summary
The authors introduce DCGrasp, a system that generates realistic 3D grasps, that is, hand poses holding objects. Their method uses a new measure of how close each part of the hand is to the object, which helps capture how hands naturally interact with objects. DCGrasp can be controlled by users and works with many different hand and object shapes. The system first generates a candidate grasp together with this distance information, then refines the grasp so that it looks natural and is physically plausible. This approach could help in areas such as robotics and virtual reality, where hands and objects need to interact realistically.
3D hand-object interaction, grasp generation, distance profile, diffusion transformer, physical plausibility, controllable synthesis, optimization, near-contact regions, generalization, hand pose
Authors
Hiroyasu Akada, Jesús Pérez, Emre Aksan, Vasileios Choutas, Cristian Romero, Alberto Garcia-Garcia, Vladislav Golyanik, Christian Theobalt, Thabo Beeler
Abstract
Generating 3D hand-object interactions is essential for applications in robotics, XR, and synthetic data generation, where flexible controllability and strong generalization to diverse object geometries are required. However, existing methods rarely satisfy these requirements, which limits their practical applicability. We present DCGrasp, a distance-aware controllable grasp generation system built on a novel grasp energy term. This term computes a Distance Profile, the signed distance from each hand vertex to the nearest object point, and couples it with distance-aware weighting. The result captures semantically similar hand-object interactions in near-contact regions while remaining invariant to object and hand identity. Given various control signals, DCGrasp first uses a Diffusion Transformer to generate a Distance Profile together with a corresponding candidate hand pose. We then refine the candidate pose through optimization, enforcing consistency between the optimized hand pose and the generated Distance Profile in near-contact regions. Our experiments show that DCGrasp produces high-quality, physically plausible grasps with flexible user control, generalizing to diverse object and hand shapes and scales. Our work establishes a robust and versatile pipeline for the synthesis of controllable 3D hand-object interactions.
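To make the Distance Profile idea concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the object is given as a point set with outward normals, decides the sign from the normal at the nearest point, and uses an exponential weight with a hypothetical scale `tau`. The function names (`distance_profile`, `near_contact_weights`, `profile_energy`), the sign convention, and the weighting form are all illustrative assumptions; the abstract does not specify them.

```python
import math


def distance_profile(hand_vertices, object_points, object_normals):
    """Signed distance from each hand vertex to its nearest object point.

    Assumed sign convention: positive outside the object, negative inside,
    decided by the outward normal at the nearest object point.
    """
    profile = []
    for v in hand_vertices:
        # Brute-force nearest-neighbour search over the object point set.
        nearest = min(range(len(object_points)),
                      key=lambda i: math.dist(v, object_points[i]))
        p, n = object_points[nearest], object_normals[nearest]
        dist = math.dist(v, p)
        # Project the offset onto the normal to decide inside vs. outside.
        side = sum((vc - pc) * nc for vc, pc, nc in zip(v, p, n))
        profile.append(dist if side >= 0.0 else -dist)
    return profile


def near_contact_weights(profile, tau=0.01):
    """Hypothetical distance-aware weights: near 1 at contact, decaying with |d|."""
    return [math.exp(-abs(d) / tau) for d in profile]


def profile_energy(current, target, tau=0.01):
    """Weighted squared mismatch between a pose's profile and a target profile.

    Weights come from the target, so near-contact vertices dominate (assumption).
    """
    weights = near_contact_weights(target, tau)
    return sum(w * (c - t) ** 2 for w, c, t in zip(weights, current, target))


if __name__ == "__main__":
    # Toy object: three samples of the plane z = 0 with upward normals.
    obj = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    nrm = [(0.0, 0.0, 1.0)] * 3
    # Toy "hand": one vertex near the surface, one penetrating, one far away.
    hand = [(0.0, 0.0, 0.02), (1.0, 0.0, -0.01), (0.0, 1.0, 0.5)]
    prof = distance_profile(hand, obj, nrm)
    print(prof, near_contact_weights(prof))
```

Under these assumptions, a refinement step in the spirit of the abstract would adjust the hand pose parameters to lower `profile_energy` between the current pose's profile and the generated one. Because the weights decay with distance, vertices far from the object contribute almost nothing, so the mismatch is driven by near-contact regions.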