IMWM: Intuition Models Complement World Models for Latent Planning

2026-06-01 • Machine Learning

AI summary

The authors studied how to improve planning from raw pixels by combining two models: a world model that predicts future states and an intuition model trained from demonstrations to recognize promising actions. They found that even with a perfect world model, a sample-based planner with a finite budget can still fail on some tasks, so the bottleneck can lie in search rather than in model accuracy. To address this, their IMWM method uses demonstrations to guide planning through three components: Retrieval Initialization, Hybrid Cost, and a Reliability Gate. On four goal-reaching tasks, IMWM achieved higher mean success than the world-model-only planner, with the largest gains on Two-Room and OGBench-Cube.

latent world model, sample-based planner, forward predictor, rollout, intuition model, demonstrations, action proposal, hybrid cost, goal-reaching tasks, pixel-based control

Authors

Baoqi Gao, Ruize Han, Miao Wang, Song Wang

Abstract

Planning with a learned latent world model is a promising route to control from raw pixels, but a strong world model alone is not enough. We show this experimentally: even with a perfect world model (operationalized by replacing the learned forward predictor with an idealized rollout of the true environment dynamics), a finite-budget sample-based planner still fails on some tasks, indicating that the bottleneck can lie in search rather than in world-model accuracy. Motivated by this gap, we propose IMWM (Intuition Model + World Model), which pairs the world model with an intuition model trained from demonstrations to recognize promising actions. The two models collaborate through three lightweight components: (i) Retrieval Initialization, which initializes the planner's action proposal from a retrieved demonstration; (ii) Hybrid Cost, which combines the intuition score with the world-model rollout cost; and (iii) a Reliability Gate, which adjusts how much the planner trusts intuition in each setting. Across four pixel-based goal-reaching tasks (Two-Room, Reacher, Push-T, and OGBench-Cube), IMWM has higher mean success than the world-model-only planner on all four, with the largest gains on Two-Room (99.2%, +11.5 percentage points) and OGBench-Cube (94.7%, +28.5 percentage points).
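
To make the interplay of the three components concrete, here is a minimal toy sketch, not the authors' implementation. It assumes a 1-D integrator in place of the learned latent world model, a distance-to-demonstration score in place of the learned intuition model, a scalar `gate` weight in place of the Reliability Gate, and a cross-entropy-method loop as the sample-based planner. Every function name, the additive form of the hybrid cost, and all hyperparameters are hypothetical choices made for illustration.

```python
import random

def rollout_cost(state, actions, goal):
    # Toy stand-in for a world-model rollout: 1-D integrator dynamics (assumption).
    for a in actions:
        state += a
    return abs(state - goal)

def intuition_score(actions, demo_actions):
    # Toy stand-in for a learned intuition model: higher when the candidate
    # action sequence is closer to the retrieved demonstration (assumption).
    return -sum((a - d) ** 2 for a, d in zip(actions, demo_actions))

def hybrid_cost(wm_cost, score, gate):
    # Assumed additive form: a higher intuition score lowers the cost,
    # and gate >= 0 sets how much intuition is trusted (0 = world model only).
    return wm_cost - gate * score

def retrieve_demo(state, demos):
    # Retrieval initialization: pick the demo whose start state is nearest.
    return min(demos, key=lambda d: abs(d["start"] - state))

def plan(state, goal, demos, gate, horizon=4, samples=64, elites=8, iters=5, seed=0):
    rng = random.Random(seed)
    demo = retrieve_demo(state, demos)
    mean = list(demo["actions"][:horizon])  # proposal starts at the demo actions
    std = [0.5] * horizon
    for _ in range(iters):
        cands = [[rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(samples)]
        cands.sort(key=lambda c: hybrid_cost(
            rollout_cost(state, c, goal), intuition_score(c, demo["actions"]), gate))
        best = cands[:elites]
        # Refit a diagonal Gaussian proposal to the lowest-cost candidates.
        mean = [sum(c[t] for c in best) / elites for t in range(horizon)]
        std = [max(1e-3, (sum((c[t] - mean[t]) ** 2 for c in best) / elites) ** 0.5)
               for t in range(horizon)]
    return mean

# Hypothetical demonstrations: a start state plus the actions that were taken.
DEMOS = [
    {"start": 0.0, "actions": [0.5, 0.5, 0.5, 0.5]},
    {"start": 5.0, "actions": [-0.5, -0.5, -0.5, -0.5]},
]

actions = plan(state=0.2, goal=2.0, demos=DEMOS, gate=0.5)
print("planned actions:", actions)
print("final rollout cost:", rollout_cost(0.2, actions, 2.0))
```

Setting `gate=0.0` in this sketch gives a world-model-only planner that still benefits from the retrieved initialization. Raising `gate` pulls the plan toward the demonstration, which is the trade-off a reliability gate would have to manage when intuition is less trustworthy.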
