ConCent: Contact-Centric Real-to-Sim-to-Real Learning from One Demonstration

2026-06-29

Robotics
AI summary

The authors aim to make robot policies trained in simulation work more reliably in the real world, especially for contact-rich manipulation tasks. They argue that success depends on reproducing when, where, and how objects touch each other during the task. To do this, they extract the sequence of contact events from a real demonstration and use it as a reward signal for training in simulation. The policy is thereby steered toward physically realistic contacts rather than simulator artifacts, which makes it more reliable once deployed on a real robot. In their experiments, the approach transfers more stably and robustly than unconstrained RL baselines that ignore these contact interactions.

sim-to-real transfer, robot manipulation, contact dynamics, reinforcement learning, contact event sequence, simulation fidelity, real-world demonstrations, policy transfer, contact geometry, structured reward
Authors
Heecheol Kim, Namiko Saito, Katsushi Ikeuchi, Yasuyuki Matsushita
Abstract
Sim-to-real policy transfer -- deploying policies trained in simulation in the real world -- is a promising paradigm for scaling robot manipulation without large-scale real-world data. However, transferring simulation-trained policies remains challenging due to discrepancies in contact dynamics -- particularly in contact-rich tasks where subtle differences can alter task outcomes entirely. Because interaction between the manipulated object and the environment is mediated through contact, task success depends on accurately reproducing task-relevant contacts. Accordingly, in manipulation, contact-centric fidelity -- reproducing both the contact event sequence (when, where, and how contacts occur) and the local contact dynamics (how forces and motions evolve at each contact) -- is a necessary condition for task success. Based on this insight, we propose a contact-centric real-to-sim-to-real RL framework that uses task-relevant contact event sequences extracted from real demonstrations as the learning objective. We approximate objects as groups of primitives and optimize their contact geometry in simulation so that the resulting local contact dynamics explain the observed state transitions. The contact event sequence is automatically extracted by replaying the demonstration. This sequence serves as a structured reward signal, guiding the policy toward physically plausible contact regimes validated in reality and preventing exploitation of unrealistic simulator contacts. The signal is obtained automatically, requiring no per-task reward design. Experiments on contact-rich manipulation tasks demonstrate more stable and robust sim-to-real policy transfer compared to unconstrained RL baselines.
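
To make the idea of a contact event sequence acting as a structured reward more concrete, here is a minimal illustrative sketch in Python. It is not the authors' implementation: the event encoding `(body_a, body_b, kind)`, the class name `ContactSequenceReward`, the bonus and penalty values, and the peg-in-hole body names are all assumptions made for this example. The sketch rewards a policy for producing the next expected contact event from a reference sequence and penalizes contact events that the reference does not expect at that point.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical event encoding: (body_a, body_b, kind), kind in {"make", "break"}.
ContactEvent = Tuple[str, str, str]


@dataclass
class ContactSequenceReward:
    """Toy reward that tracks progress through a reference contact event sequence."""

    reference: List[ContactEvent]   # events extracted from a demonstration (assumed given)
    match_bonus: float = 1.0        # assumed bonus for the next expected event
    mismatch_penalty: float = 0.5   # assumed penalty for an unexpected event
    progress: int = 0               # index of the next expected reference event

    def step(self, observed: List[ContactEvent]) -> float:
        """Return the reward for the contact events observed in one simulation step."""
        reward = 0.0
        for event in observed:
            expected = (
                self.reference[self.progress]
                if self.progress < len(self.reference)
                else None
            )
            if event == expected:
                reward += self.match_bonus
                self.progress += 1
            else:
                reward -= self.mismatch_penalty
        return reward

    @property
    def done(self) -> bool:
        """True once every reference event has been reproduced in order."""
        return self.progress == len(self.reference)


# Hypothetical reference sequence for a peg-in-hole style task.
reference = [
    ("peg", "table", "break"),
    ("peg", "hole_rim", "make"),
    ("peg", "hole_rim", "break"),
    ("peg", "hole_bottom", "make"),
]

tracker = ContactSequenceReward(reference=reference)
for observed in ([reference[0]], [("peg", "wall", "make")], reference[1:]):
    print(tracker.step(observed), tracker.done)
```

In a real pipeline the observed events would come from the simulator's contact reports at each step, and the reward would be combined with whatever RL algorithm is in use; both are outside the scope of this sketch.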