Cross-Stage Attention Multi-Expert Network for Radiologist-Inspired Breast Ultrasound Diagnosis

2026-05-25 • Computer Vision and Pattern Recognition

Computer Vision and Pattern Recognition, Artificial Intelligence

AI summary

The authors created a new computer model called CSA-MoE-Net to help doctors tell if breast lumps in ultrasound images are harmless or cancerous. Their model looks at different parts of the tumor and combines this information smartly to focus on important details while ignoring noise. Tested on over 2,000 images, their method showed better accuracy and other measures than a standard model. This approach can be added to other common image models and helps improve breast cancer detection without needing complicated changes.

Breast ultrasound imaging, Benign/malignant classification, ResNet-18, Cross-Stage Attention, Mixture of Experts, Adaptive Gating Network, Feature representation, F1-score, Area Under Curve (AUC), Computer-aided diagnosis

Authors

Xinyang Zhai, Chong Yang, Ruizhi Zhang

Abstract

Breast ultrasound imaging is an important noninvasive method for early breast cancer diagnosis, but automatic benign/malignant classification remains challenging due to tumor heterogeneity, blurred boundaries, and data imbalance. To improve feature representation and classification accuracy, this paper proposes the Cross-Stage Attention Mixture-of-Experts Network (CSA-MoE-Net). It adopts a Cross-Stage Attention-enhanced ResNet-18 as the backbone, in which the Cross-Stage Attention module adaptively recalibrates multi-level features, thereby enhancing key tumor features and suppressing redundancy. A three-branch Mixture of Experts (MoE) Block learns complementary features from the Whole Tumor Image, Tumor Core, and Boundary, and an Adaptive Gating Network fuses them to capture morphological, textural, and contextual information. The fused features are denoted as Fused Expert Feature (FEF) in the architecture. Experiments on a balanced dataset of 2,129 breast ultrasound images show that, averaged over 20 independent runs, the model achieves an accuracy of 96.33%, precision of 94.09%, recall of 98.53%, F1-score of 96.25%, and AUC of 99.50%. Compared to the baseline ResNet-18, these metrics improve by 3.01, 0.70, 5.37, 2.98, and 5.42 percentage points, respectively. The proposed mechanism requires no invasive modification and can be seamlessly embedded into VGG-16, DenseNet-121, etc., yielding stable performance gains, thus providing reliable support for computer-aided diagnosis.
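
The fusion step described in the abstract, in which an Adaptive Gating Network combines the Whole Tumor Image, Tumor Core, and Boundary expert features into the Fused Expert Feature (FEF), can be illustrated with a minimal sketch. The abstract does not specify the form of the gate, so this example assumes a common choice: a linear scorer followed by a softmax, with the FEF taken as the weighted sum of the expert feature vectors. The function names, the gate weight vector, and the toy feature values are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scalars."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gate_logits(expert_features, gate_weights):
    """Score each expert with a dot product against a shared weight vector.

    Assumption: a single linear scorer; the paper's gate may differ.
    """
    return [
        sum(w * f for w, f in zip(gate_weights, feats))
        for feats in expert_features
    ]


def fuse_experts(expert_features, gate_weights):
    """Return (gate weights, fused feature) as a softmax-weighted sum."""
    weights = softmax(gate_logits(expert_features, gate_weights))
    dim = len(expert_features[0])
    fused = [
        sum(weights[k] * expert_features[k][i]
            for k in range(len(expert_features)))
        for i in range(dim)
    ]
    return weights, fused


# Toy 4-dimensional features standing in for the three expert branches.
whole = [0.2, 0.4, 0.1, 0.3]      # Whole Tumor Image expert
core = [0.9, 0.1, 0.5, 0.2]       # Tumor Core expert
boundary = [0.3, 0.8, 0.6, 0.7]   # Boundary expert
gate_w = [1.0, -0.5, 0.5, 0.25]   # hypothetical learned gate weights

weights, fused = fuse_experts([whole, core, boundary], gate_w)
print("gate weights:", [round(w, 3) for w in weights])
print("fused feature:", [round(v, 3) for v in fused])
```

Because the gate weights are non-negative and sum to one, each component of the fused vector is a convex combination of the corresponding expert components. In a trained network the gate would be learned jointly with the experts and would typically operate on feature maps or pooled embeddings from the backbone rather than on hand-written vectors.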
