Streaming Video Generation with Streaming Force Control

2026-06-05

Computer Vision and Pattern Recognition
AI summary

The authors present StreamForce, a method for generating videos that respond realistically to forces that change over time. Unlike previous methods that train separate models for different force types, assume fixed forces, or rely on non-causal processing, it uses a single causal model that responds immediately and coherently to both local and global forces. Forces are encoded in a unified representation that serves as the control signal, and a distillation pipeline trains the model to follow these controls while keeping the video visually and dynamically consistent. The system runs at up to 16.6 FPS on a single GPU and reports state-of-the-art force adherence and motion realism.

streaming video generation, physically grounded control, force inputs, autoregressive model, causal processing, video synthesis, force representation, distillation pipeline, photometric realism, motion realism
Authors
Hanhui Wang, Yiming Xie, Haiwen Feng, Zhaoyang Lv, Shenlong Wang, Huaizu Jiang
Abstract
We introduce StreamForce, a streaming video generation framework that enables physically grounded control through continuous force inputs. Unlike prior video models that train separate models for different force types, assume fixed forces, or rely on non-causal processing, StreamForce is a causal and unified model that responds instantly and coherently to both local and global, time-varying forces. To achieve this, we design a unified force representation as a control signal and develop a distillation pipeline for force-controllable video generation. Our model combines autoregressive efficiency with force responsiveness, sustaining stable photometric and dynamic realism. StreamForce runs at up to 16.6 FPS on a single GPU, achieving state-of-the-art performance in both force adherence and motion realism. Project website: https://neu-vi.github.io/StreamForce/