Towards in-the-wild Egocentric 3D Hand-Object Pose Estimation
2026-06-29 • Computer Vision and Pattern Recognition
AI summary
The authors introduce EPIC-Contact, a dataset of in-the-wild egocentric (first-person) video clips of hands interacting with objects, annotated with dense 3D hand-object contact correspondences and posed meshes. They also propose HOPformer, a transformer-based model that predicts the 3D pose of both hands and the object in a single forward pass, using a cross-attention decoder that conditions object features on hand priors. HOPformer reaches an 82.4% success rate on the in-lab ARCTIC dataset (+6.2 points over the previous state of the art) and, on EPIC-Contact, nearly doubles the success rate while reducing contact deviation by 75%. The dataset, code, and checkpoints are publicly released.
3D hand-object pose estimation, egocentric RGB, transformer model, cross-attention, contact correspondences, bi-manual pose, in-the-wild dataset, mesh posing, pose generalization, ARCTIC dataset
Authors
Siddhant Bansal, Zhifan Zhu, Shashank Tripathi, Jiahe Zhao, Michael J. Black, Dima Damen
Abstract
Estimating accurate 3D hand-object pose from in-the-wild egocentric RGB remains challenging due to severe occlusions and ambiguous contact. Existing learning-based methods often struggle to generalise to in-the-wild scenes and are limited by the scarcity of supervision. We address these issues with two contributions. First, we introduce EPIC-Contact, an in-the-wild egocentric dataset of 2.3K clips (62.3K frames) with dense, bijective 3D hand-object contact correspondences and posed meshes. Second, we propose HOPformer, an end-to-end transformer that jointly predicts bi-manual hand and object pose in a single forward pass. A cross-attention decoder conditions object features on hand priors, producing robust pose estimation. We test HOPformer on the in-lab 3D dataset, ARCTIC, as well as our newly introduced EPIC-Contact dataset. HOPformer reaches 82.4% success rate on ARCTIC (+6.2 pts over current SOTA). On EPIC-Contact, it nearly doubles the success rate while reducing contact deviation by 75%. EPIC-Contact, HOPformer code and checkpoints are released: https://sid2697.github.io/epic-contact.
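To illustrate what "conditioning object features on hand priors" through cross-attention can look like, here is a minimal sketch in plain Python. It is not HOPformer's architecture: it assumes a single attention head, omits the learned query/key/value projections a real transformer decoder would use, and the names `cross_attention`, `object_tokens`, and `hand_tokens` are hypothetical. Object tokens act as queries, and hand tokens supply the keys and values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(object_tokens, hand_tokens):
    """Single-head cross-attention with no learned projections (assumption).

    Each object token (query) attends over all hand tokens (keys and values),
    and the attended hand context is added back as a residual.
    """
    dim = len(hand_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    conditioned = []
    for query in object_tokens:
        # Scaled dot-product similarity between this object token and each hand token.
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in hand_tokens]
        weights = softmax(scores)
        # Weighted average of hand tokens: the "hand prior" seen by this object token.
        context = [
            sum(w * value[d] for w, value in zip(weights, hand_tokens))
            for d in range(dim)
        ]
        # Residual connection keeps the original object feature.
        conditioned.append([q + c for q, c in zip(query, context)])
    return conditioned


# Toy example: two hand tokens and one object token, each 2-dimensional.
hand_tokens = [[1.0, 0.0], [0.0, 1.0]]
object_tokens = [[2.0, 0.0]]
conditioned = cross_attention(object_tokens, hand_tokens)
```

In this toy case the object token is more similar to the first hand token, so that token receives the larger attention weight and contributes most of the added context.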