JOIN: Anchor-Grasp-Conditioned Joining via Opposition, Inference, and Navigation for Bimanual Assistive Manipulation

2026-06-09

Robotics
AI summary

The authors address the challenge of using two arms for daily tasks like opening jars or pouring liquids when a wheelchair typically only has space for one robotic arm. They propose a system where a fixed wheelchair-mounted arm works together with a second mobile arm that can be brought over when needed. Their method plans how the second arm approaches and grasps objects to complete tasks alongside the first arm. Testing showed their system, called JOIN, was better at finishing tasks with less help than other existing methods.

assistive robotics, bimanual manipulation, wheelchair-mounted arm, mobile manipulator, grasp planning, vision-language model, task-level knowledge, manipulability, human-robot interaction, daily living activities
Authors
Drake Moore, Matt Cheng, Xiang Zhi Tan, Taşkın Padır
Abstract
Assistive mobility and manipulation platforms have received increasing attention as a means of restoring independence to individuals with disabilities. While such platforms are effective for many basic activities of daily living (ADLs), a significant percentage of everyday tasks, such as opening a jar, pouring a liquid, lifting a tray, or basic meal preparation, are fundamentally bimanual and remain out of reach for any single-arm system. Adding a second arm to a wheelchair is impractical because of the additional power draw and cost and the loss of space required for transfers and mobility. We instead propose a heterogeneous, on-demand bimanual system in which a wheelchair-mounted anchor arm is joined when needed by a summoned mobile manipulator that serves as a complement arm. The central technical problem, which we call bimanual joining, is conditional: the anchor has already committed to a grasp, and the complement arm must choose where to stand and what to grasp to complete the task. We formulate bimanual joining as a three-phase decomposition (plan, drive, grasp) and show that a vision-language model (VLM), coupled with standard geometric tools, provides task-level knowledge sufficient to solve a representative class of bimanual ADLs. Our system, JOIN, contributes (i) a wheelchair-referenced opposition score and (ii) task-conditioned directional manipulability. We evaluate JOIN on a Kinova Gen3 anchor and a Hello Robot Stretch 3 complement on representative same-object and different-object tasks. JOIN succeeded in more attempts (19/20) than state-of-the-art methods (14/20) and required markedly less correction by the operator.
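
The abstract does not define task-conditioned directional manipulability, so the following is only a minimal sketch of the standard velocity-ellipsoid measure along a chosen direction, which such a score might build on. It assumes a hypothetical planar two-link arm rather than the Kinova Gen3 or Stretch 3, and the function names, link lengths, and the idea of passing the task's motion direction as `u` are illustrative assumptions, not the authors' method. For a square, invertible Jacobian J and unit direction u, the measure is 1 / ||J^{-1} u||.

```python
import math

def planar_jacobian(q1, q2, l1=1.0, l2=1.0):
    """Position Jacobian of a hypothetical planar two-link arm (illustrative only)."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]

def directional_manipulability(J, u, eps=1e-12):
    """Radius of the velocity ellipsoid along direction u: 1 / ||J^{-1} u||.

    Returns 0.0 when the 2x2 Jacobian is (numerically) singular.
    """
    # Normalize the requested task direction.
    norm_u = math.hypot(u[0], u[1])
    ux, uy = u[0] / norm_u, u[1] / norm_u
    (a, b), (c, d) = J
    det = a * d - b * c
    if abs(det) < eps:
        return 0.0
    # Joint velocity needed to produce unit end-effector velocity along u.
    qd1 = (d * ux - b * uy) / det
    qd2 = (-c * ux + a * uy) / det
    return 1.0 / math.hypot(qd1, qd2)

# Elbow bent at 90 degrees: the arm can move comparably well in both directions.
J_bent = planar_jacobian(0.0, math.pi / 2)
m_bent_x = directional_manipulability(J_bent, (1.0, 0.0))
m_bent_y = directional_manipulability(J_bent, (0.0, 1.0))

# Arm almost fully stretched along x: radial motion (x) is nearly lost.
J_stretched = planar_jacobian(0.0, 0.01)
m_str_x = directional_manipulability(J_stretched, (1.0, 0.0))
m_str_y = directional_manipulability(J_stretched, (0.0, 1.0))
```

Under these assumptions, a candidate standing pose for the complement arm could be compared by evaluating the measure along the direction the task needs (for example, the pull axis when opening a jar), so a pose that leaves the arm nearly stretched along that axis would score poorly.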