NewtPhys: Do Foundation Models Understand Newtonian Physics?

2026-06-02 | Computer Vision and Pattern Recognition

AI summary

The authors introduce NewtPhys, a dataset that combines multiview images of real-world scenes with physics-grounded simulations to test how well foundation models understand basic Newtonian physics, such as forces and motion. Unlike previous benchmarks built on synthetic or semi-synthetic scenes, NewtPhys provides dense, per-timestep annotations of physical properties in 3D. The authors used the dataset to evaluate 56 vision-language models (VLMs) and 10 vision foundation models (VFMs), and report limitations in low-level physics reasoning. The dataset is also intended to support future research on physics-grounded vision and physics-aware evaluation.

foundation models, visual question-answering, Newtonian physics, multiview images, physics simulations, 3D forces, vision-language models, physics reasoning, dataset benchmarking
Authors
Sebastian Cavada, Soumava Paul, Tuan-Hung Vu, Andrei Bursuc, Raoul de Charette
Abstract
Previous work has evaluated physics reasoning in foundation models using synthetic or semi-synthetic scenes and visual question-answering tasks. However, these benchmarks emphasize high-level events and lack the visual fidelity required to assess true low-level Newtonian understanding. We introduce NewtPhys, a 4D physically annotated dataset built from multiview images of real-world scenes with physics-grounded simulations. The dataset provides dense, fine-grained annotations across timesteps, including 3D forces and amodal per-pixel quantities covering physics, tracking, semantics and geometry, bridging the gap between simplistic synthetic setups and realistic visual complexity. Using NewtPhys, we systematically evaluate 56 VLMs (54 open-weight models and 2 closed-source frontier models) and 10 VFMs, and reveal limitations in low-level physics reasoning. Beyond benchmarking, our dataset enables future research in physics-grounded vision and the development of next-generation physics-aware evaluations. Code and datasets are available at https://astra-vision.github.io/NewtPhys.