GraspGen-X: Cross-Embodiment 6-DOF Diffusion-based Grasping

2026-05-31 · Robotics

AI summary

The authors developed a robot grasping method that generalizes not only to new objects and scenes but also to robot grippers it has not seen during training. They encode each gripper's shape and motion with a swept-volume heuristic and train the model on procedurally generated grippers and a dataset of 2 billion grasps. Without additional training, the approach outperforms baseline methods on novel real-world grippers. The model also serves as a good starting point for fine-tuning on new grippers.

6-DOF grasping, cross-embodiment, diffusion model, swept-volume heuristic, gripper morphology, procedural training data, zero-shot generalization, robot grasping, fine-tuning, simulation experiments
Authors
Beining Han, Yu-Wei Chao, Erwin Coumans, Clemens Eppner, Balakumar Sundaralingam, Jia Deng, Stan Birchfield, Adithyavairavan Murali
Abstract
We study cross-embodiment 6-DOF robot grasping. Unlike prior work, we require the model to generalize not only to novel objects and scenes but also to novel gripper morphologies and physical grasping processes. Our method extends diffusion-model-based generative 6-DOF grasping models to condition on an additional gripper representation. We propose a swept-volume heuristic for encoding the gripper. We train our cross-embodiment model with procedural grippers and a large-scale dataset of 2 billion grasps. In simulation experiments, our model achieves the best zero-shot generalization to novel real-world grippers and objects among the compared baseline methods. Our model also serves as a good initialization for fine-tuning to adapt to novel grippers. In ablations, we demonstrate the efficiency of our swept-volume gripper representation and our procedural gripper training dataset. Finally, we show zero-shot generalization to real-world novel grippers for 6-DOF grasping, surpassing baselines in cross-embodiment generalization.
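The abstract does not say how the swept-volume encoding is computed, so the following is only an illustrative sketch of the general idea under strong assumptions: a parallel-jaw gripper with two box-shaped fingers that close along the x axis, represented by the union of sampled finger points over the closing motion. All function names, parameters, and dimensions are hypothetical and are not taken from the paper.

```python
from itertools import product


def linspace(start, stop, num):
    """Return `num` evenly spaced values from start to stop (inclusive)."""
    if num == 1:
        return [start]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def finger_points(width, depth, length, n=3):
    """Sample an n x n x n grid of points in one box-shaped finger.

    The box is centered at the origin in x and y and spans 0..length in z.
    (Assumed finger shape, for illustration only.)
    """
    xs = linspace(-width / 2, width / 2, n)
    ys = linspace(-depth / 2, depth / 2, n)
    zs = linspace(0.0, length, n)
    return list(product(xs, ys, zs))


def swept_volume_points(open_width, finger_dims, steps=5, n=3):
    """Union of finger sample points over an assumed closing motion.

    The gap between the inner finger faces shrinks from `open_width` to 0
    in `steps` increments; both fingers move symmetrically along x.
    """
    width = finger_dims[0]
    base = finger_points(*finger_dims, n=n)
    cloud = []
    for gap in linspace(open_width, 0.0, steps):
        center = gap / 2 + width / 2  # finger center offset from the midplane
        for sign in (-1.0, 1.0):
            cloud.extend((x + sign * center, y, z) for x, y, z in base)
    return cloud


# Hypothetical gripper: 8 cm opening, fingers 1 cm x 2 cm x 4 cm.
cloud = swept_volume_points(0.08, (0.01, 0.02, 0.04))
```

A point set like `cloud` could then be passed to a point-cloud encoder to produce a conditioning vector for the grasp generator. Whether the paper's representation takes this form is an assumption here.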