Spatio-Temporal Correlation Guided Geometric Partitioning for Versatile Video Coding

2026-06-01
Computer Vision and Pattern Recognition

AI summary

The authors propose spatio-temporal correlation guided geometric partitioning (STGEO), a method for improving video compression in Versatile Video Coding (VVC). The method predicts which partitioning modes and motion candidates a block is likely to use, which reduces the number of bits needed to signal this side information. The predictions draw on edge information and on the modes of adjacent, previously coded STGEO blocks. In tests, the approach saves 0.95% (Random Access) and 1.98% (Low-Delay B) in bit rate on average compared with VTM-8.0 without geometric partitioning.

Video Coding, Versatile Video Coding (VVC), Geometric Partitioning (GEO), Motion Vector, Spatio-Temporal Correlation, Entropy Coding, Bit-rate Saving, Partitioning Mode, Merge Candidate List, Motion Field
Authors
Xuewei Meng, Chuanmin Jia, Xinfeng Zhang, Shanshe Wang, Siwei Ma
Abstract
Geometric partitioning has attracted increasing attention for its remarkable ability to describe the motion field in the hybrid video coding framework. However, the existing geometric partitioning (GEO) scheme in Versatile Video Coding (VVC) imposes a non-negligible burden for signaling side information, which limits coding efficiency. In view of this, we propose a spatio-temporal correlation guided geometric partitioning (STGEO) scheme to efficiently describe object information in the motion field. The proposed method reduces the bits consumed for signaling side information, including the partitioning mode and motion information. We first statistically analyze the characteristics of partitioning mode decision and motion vector selection. Based on the observed spatio-temporal correlation, we design a mode prediction and coding method to reduce the overhead of representing the above-mentioned side information. The main idea is to predict the STGEO modes and motion candidates that have higher selection probabilities and use these predictions to guide the entropy coding, i.e., to represent the predicted high-probability modes and motion candidates with fewer bits. In particular, the high-probability STGEO modes are predicted based on the edge information and the history modes of adjacent STGEO-coded blocks. The corresponding motion information is represented by an index into a merge candidate list, which is adaptively inferred based on off-line trained merge candidate selection probabilities. Simulation results show that the proposed approach achieves 0.95% and 1.98% bit-rate savings on average compared to VTM-8.0 without GEO for the Random Access and Low-Delay B configurations, respectively.
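
The idea of spending fewer bits on predicted high-probability modes can be illustrated with a small sketch. This is not the STGEO algorithm or VTM code: the scoring rule, the `history_bonus` weight, the function names, and the choice of a truncated unary code are all assumptions made for illustration. The sketch ranks hypothetical mode indices by an edge-based score boosted by the modes of neighbouring blocks, then counts the bits a truncated unary code would need for the resulting rank.

```python
from typing import Dict, List


def rank_modes(edge_scores: Dict[int, float],
               history_modes: List[int],
               history_bonus: float = 0.5) -> List[int]:
    """Order candidate modes so that the likeliest come first.

    edge_scores:   hypothetical per-mode likelihood from edge analysis.
    history_modes: modes chosen by adjacent, previously coded blocks.
    history_bonus: assumed weight added once per neighbouring occurrence.
    """
    scores = dict(edge_scores)
    for mode in history_modes:
        if mode in scores:
            scores[mode] += history_bonus
    # Descending score; ties are broken by the smaller mode index.
    return sorted(scores, key=lambda m: (-scores[m], m))


def truncated_unary_bits(rank: int, list_size: int) -> int:
    """Bits needed to code `rank` (0-based) with a truncated unary code."""
    if list_size <= 1:
        return 0
    # The last rank needs no terminating bit.
    return rank + 1 if rank < list_size - 1 else list_size - 1


def bits_for_mode(mode: int, ranked: List[int]) -> int:
    """Bits spent signalling `mode` given the predicted ranking."""
    return truncated_unary_bits(ranked.index(mode), len(ranked))


if __name__ == "__main__":
    ranked = rank_modes({0: 0.1, 1: 0.7, 2: 0.2, 3: 0.4}, history_modes=[3, 3])
    for mode in ranked:
        print(mode, bits_for_mode(mode, ranked))
```

With four candidate modes, a fixed-length index costs 2 bits per block. Under this ranking, the top-predicted mode costs 1 bit and the least likely modes cost 3, so the scheme saves bits only when the prediction is right often enough, which is why the accuracy of the mode prediction matters.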