PhysScene: A Scene Graph Dataset for Scientific Visual Reasoning in Physics Experiments
2026-06-08 • Computer Vision and Pattern Recognition
Computer Vision and Pattern Recognition, Artificial Intelligence
AI summary
The authors created a new dataset called PhysScene to help computers understand pictures from physics experiments, focusing on objects and how they relate functionally, not just where things are. Unlike existing datasets that mostly show everyday scenes, PhysScene includes scientific tools and setups with detailed and meaningful connections. This makes it harder but more useful for teaching AI to reason about experiments. Their tests show PhysScene adds a new challenge and complements current resources for visual understanding in science contexts.
Scene Graphs, Relational Reasoning, Physics Experiments, Visual Scene Parsing, Semantic Constraints, Dataset, Experimental Setup, Function-oriented Relations, Visual Reasoning
Authors
Minghao Zou, Qingtian Zeng, Shangkun Liu, Yanda Meng, Guanghui Yue, Baoquan Zhao, Abdulmotaleb El Saddik, Wei Zhou
Abstract
Scene Graphs (SGs) provide structured representations of visual scenes by modeling objects and their pairwise relationships. Despite recent progress, existing datasets primarily focus on generic natural contexts, leaving domain-specific and function-oriented scenes largely underexplored. This limitation restricts the evaluation of relational reasoning in scientific experimental scenes, thereby hindering the development of intelligent monitoring, analysis, and related applications in such scenes. To address this gap, we introduce PhysScene, the first SG dataset tailored to physics experiments. PhysScene encompasses specialized instruments, structured experimental setups, and functional relations intrinsic to experimental environments, enabling reasoning that extends beyond spatial co-occurrence to logical dependencies. Rather than pursuing large data scale, PhysScene focuses on strong semantic constraints and high relation density in experimental scenes, posing new challenges for existing scene parsing algorithms while offering opportunities for further improvements. Extensive analyses and experiments show that PhysScene complements existing benchmarks and establishes a valuable testbed for advancing scientific visual reasoning. The dataset is publicly available at https://github.com/ZMH-SDUST/PhysScene.
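To make the notion of a scene graph concrete, the following is a minimal Python sketch of objects and pairwise relations for a simple circuit scene. It is an illustration only and does not reflect the PhysScene annotation format: the class names, the object labels, the predicates, and the definition of relation density as relations per object are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SceneObject:
    """One object in the scene: an id, a category label, and a box (x, y, w, h)."""
    obj_id: int
    label: str
    bbox: tuple


@dataclass
class SceneGraph:
    """Objects plus directed (subject, predicate, object) triples between them."""
    objects: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def add_object(self, obj):
        self.objects[obj.obj_id] = obj

    def add_relation(self, subj_id, predicate, obj_id):
        # Both endpoints must already exist in the graph.
        if subj_id not in self.objects or obj_id not in self.objects:
            raise KeyError("both endpoints must be added before relating them")
        self.relations.append((subj_id, predicate, obj_id))

    def triples(self):
        """Return relations as human-readable (label, predicate, label) triples."""
        return [
            (self.objects[s].label, p, self.objects[o].label)
            for s, p, o in self.relations
        ]

    def relation_density(self):
        """Relations per object (an illustrative definition, not the paper's)."""
        if not self.objects:
            return 0.0
        return len(self.relations) / len(self.objects)


# Hypothetical circuit scene; labels and predicates are made up for illustration.
graph = SceneGraph()
graph.add_object(SceneObject(0, "battery", (10, 40, 60, 30)))
graph.add_object(SceneObject(1, "wire", (70, 50, 120, 5)))
graph.add_object(SceneObject(2, "ammeter", (190, 30, 50, 50)))

graph.add_relation(0, "connected to", 1)          # functional relation
graph.add_relation(1, "connected to", 2)          # functional relation
graph.add_relation(2, "measures current of", 0)   # functional relation
graph.add_relation(2, "right of", 0)              # purely spatial relation

for triple in graph.triples():
    print(triple)
print("relations per object:", graph.relation_density())
```

In this toy scene, the spatial triple ("ammeter", "right of", "battery") could be read off the bounding boxes alone, whereas the functional triple ("ammeter", "measures current of", "battery") depends on what the instruments do in the setup, which is the kind of relation the abstract describes as going beyond spatial co-occurrence.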