Structure-aware Knowledge-guided Heterogeneous Mamba for Zygomaticomaxillary Suture Assessment
2026-06-15 • Computer Vision and Pattern Recognition
Computer Vision and Pattern Recognition
AI summaryⓘ
The authors created the first large public collection of images showing a specific skull joint called the Zygomaticomaxillary Suture (ZMS), which is important for certain dental treatments. They developed a new computer method named SKMamba that mimics how experts look at these images by using two pathways: one to highlight the joint edges and another to use language descriptions to better understand the images. This helps the system decide how mature the joint is more accurately than previous methods. Their approach was tested and performed very well on their new dataset.
Zygomaticomaxillary Suturemaxillary advancementorthopedic interventionssutural maturationmorphological assessmentmachine learningmulti-modal frameworklarge language modelimage pre-training
Authors
Xiaoqi Guo, Birui Chen, Xinquan Yang, Chaoyun Zhang, Xuefen Liu, Mianjie Zheng, Kun Tang, Xuguang Li, Wen Ma, Yanhua Xu, Linlin Shen
Abstract
The Zygomaticomaxillary Suture is a key circummaxillary structure that connects the zygomatic bone and the maxilla, which serves as a primary site of resistance during maxillary advancement, and its maturation status directly influences the timing and efficacy of orthopedic interventions. However, accurate staging of ZMS maturation remains challenging due to subtle high-frequency transitions in suture lines and the global semantic ambiguity between adjacent stages. To address this, we present the first public ZMS dataset, comprising 3,790 ZMS images covering the entire age range from 4 to 24 years. Based on this dataset, we propose SKMamba, a Structure-aware and Knowledge-guided Mamba-based multi-modal framework for automated ZMS maturation assessment. SKMamba adopts a decoupled dual-path architecture that mimics the hierarchical diagnostic process used by experienced orthodontists. We first introduce an Implicit Edge Extractor (IEE), which leverages structural pre-training to reduce trabecular noise and accentuate sutural boundaries. Complementarily, a Cross-Modal Semantic Alignment (CSA) module is designed to incorporate anatomical descriptions from a large language model (LLM). This module helps align local morphological cues with global semantic descriptions while ensuring that objective morphological evidence remains the primary basis for decisions. Extensive experiments on our ZMS dataset demonstrate that SKMamba achieves state-of-the-art performance compared to existing methods. Code is available at https://github.com/galaxygxq1116/SKMamba.