Physics-Guided Spatiotemporal State Space Modeling for Lookahead Molten Pool Segmentation in Laser Wire-Feed Welding

2026-06-22 • Computer Vision and Pattern Recognition

Computer Vision and Pattern Recognition, Artificial Intelligence

AI summary

The authors developed a new model called WeldMamba to predict the future shape and position of important parts in laser wire-feed welding, like the keyhole, wire, and molten pool. They use past images and sensor data to forecast what the welding area will look like half a second ahead, helping with real-time control despite system delays. Their approach combines several techniques to improve accuracy, such as encoding visual and process data, modeling changes over time, and focusing on keyhole movement. Tests showed that their method is effective in predicting weld-pool regions with good precision.

laser wire-feed welding, weld-pool segmentation, spatiotemporal modeling, keyhole, state space networks, semantic segmentation, motion-aware modeling, image encoding, temporal consistency, signed-distance-function

Authors

Sen Li, Haichao Cui, Changhao Yin, Chendong Shao, Yaqi Wang, Xinhua Tang, Fenggui Lu

Abstract

Real-time weld-pool perception is critical for closed-loop control in laser wire-feed welding, where sensing, computation, and actuator response introduce unavoidable delay. This paper presents a physics-guided spatiotemporal state space network for lookahead weld-pool segmentation. The model uses historical coaxial grayscale images, welding process parameters, and aligned wire-state electrical signals to predict the future semantic layout of three physically meaningful regions: keyhole, wire, and molten pool. It combines a visual encoder, process- and sensor-conditioned feature normalization, patch-level temporal state space modeling, horizon-conditioned latent prediction, dense future feature prediction, and a motion-aware mask decoder. Auxiliary signed-distance-function supervision, temporal consistency, feature distillation, and fine-grained keyhole losses further constrain the predicted geometry and local motion. Experiments on a 43-sequence laser welding dataset show that the proposed WeldMamba reaches 74.63% mIoU at a 500 ms lookahead. Ablation studies further show that temporal history, patch-level state space modeling, and keyhole motion awareness are the main contributors to robust future segmentation.
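
For readers unfamiliar with the mIoU metric quoted in the abstract, the following is a minimal sketch of how mean intersection-over-union is commonly computed for a predicted label map against a ground-truth label map. It is not the authors' evaluation code. The integer label encoding (`BACKGROUND`, `KEYHOLE`, `WIRE`, `POOL`), the function name `mean_iou`, the exclusion of background from the average, and the rule of skipping classes absent from both maps are all assumptions made for illustration; the abstract does not state which conventions the paper uses.

```python
from typing import Sequence

# Hypothetical label ids; the paper's actual encoding is not given in the abstract.
BACKGROUND, KEYHOLE, WIRE, POOL = 0, 1, 2, 3


def mean_iou(pred: Sequence[Sequence[int]],
             target: Sequence[Sequence[int]],
             classes: Sequence[int] = (KEYHOLE, WIRE, POOL)) -> float:
    """Mean intersection-over-union over the given classes.

    Classes that appear in neither map are skipped (one common
    convention, assumed here). Returns NaN if every class is skipped.
    """
    ious = []
    for c in classes:
        inter = 0
        union = 0
        for p_row, t_row in zip(pred, target):
            for p, t in zip(p_row, t_row):
                in_pred = p == c
                in_target = t == c
                inter += int(in_pred and in_target)
                union += int(in_pred or in_target)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy 3x4 label maps: the "future" ground truth and a lookahead prediction.
target = [
    [0, 1, 1, 0],
    [3, 3, 2, 2],
    [3, 3, 0, 0],
]
pred = [
    [0, 1, 0, 0],
    [3, 3, 2, 2],
    [3, 0, 0, 0],
]

# Per-class IoU: keyhole 1/2, wire 2/2, pool 3/4, so the mean is 0.75.
print(mean_iou(pred, target))
```

In a lookahead setting, `target` would be the annotated frame at the forecast horizon (500 ms ahead in the abstract) rather than the current frame, so the score measures forecasting quality as well as segmentation quality.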
