DSVTLA: Deep Swin Vision Transformer-Based Transfer Learning Architecture for Multi-Type Cancer Histopathological Cancer Image Classification

2026-04-10Computer Vision and Pattern Recognition

Computer Vision and Pattern Recognition
AI summary

The authors developed a new AI model that combines two methods: a Swin Transformer and ResNet50, to better understand cancer images from different body parts. Their model looks at both the big picture and tiny details in these medical images to classify different cancer types accurately. They tested it on a large and varied set of cancer images, including lungs, breast, and leukemia, and found that their model outperformed other popular AI models. The results showed very high accuracy and consistent performance across different cancers. This work helps improve AI tools for helping doctors diagnose cancer from microscope images.

Swin TransformerResNet50Transfer LearningHistopathological ImagesCancer ClassificationMulti-cancer DatasetComputer VisionConvolutional Neural NetworksFine-tuningPrecision and Recall
Authors
Muazzem Hussain Khan, Tasdid Hasnain, Md. Jamil khan, Ruhul Amin, Md. Shamim Reza, Md. Al Mehedi Hasan, Md Ashad Alam
Abstract
In this study, we proposed a deep Swin-Vision Transformer-based transfer learning architecture for robust multi-cancer histopathological image classification. The proposed framework integrates a hierarchical Swin Transformer with ResNet50-based convolution features extraction, enabling the model to capture both long-range contextual dependencies and fine-grained local morphological patterns within histopathological images. To validate the efficiency of the proposed architecture, an extensive experiment was executed on a comprehensive multi-cancer dataset including Breast Cancer, Oral Cancer, Lung and Colon Cancer, Kidney Cancer, and Acute Lymphocytic Leukemia (ALL), including both original and segmented images were analyzed to assess model robustness across heterogeneous clinical imaging conditions. Our approach is benchmarked alongside several state-of-the-art CNN and transfer models, including DenseNet121, DenseNet201, InceptionV3, ResNet50, EfficientNetB3, multiple ViT variants, and Swin Transformer models. However, all models were trained and validated using a unified pipeline, incorporating balanced data preprocessing, transfer learning, and fine-tuning strategies. The experimental results demonstrated that our proposed architecture consistently gained superior performance, reaching 100% test accuracy for lung-colon cancer, segmented leukemia datasets, and up to 99.23% accuracy for breast cancer classification. The model also achieved near-perfect precision, f1 score, and recall, indicating highly stable scores across divers cancer types. Overall, the proposed model establishes a highly accurate, interpretable, and also robust multi-cancer classification system, demonstrating strong benchmark for future research and provides a unified comparative assessment useful for designing reliable AI-assisted histopathological diagnosis and clinical decision-making.