TRACE: A Concept Bottleneck Model for Longitudinal 3D Glioblastoma Response Assessment

2026-06-29 • Computer Vision and Pattern Recognition

Computer Vision and Pattern Recognition, Machine Learning

AI summary

The authors developed TRACE, a computer model that helps doctors track how glioblastoma tumors change over time using MRI scans. Instead of just giving a simple label, TRACE breaks down the tumor changes into understandable clinical concepts based on RANO criteria. This method allows doctors to check and adjust the measurements, making the results more transparent. Their tests show that TRACE performs comparably to other AI methods but adds clarity, though more data and validation are needed.

Glioblastoma, MRI, RANO criteria, Deep learning, Concept bottleneck model, Longitudinal assessment, Tumor response, Multimodal imaging, Cross-validation, Interpretable AI

Authors

Alia Tarek, Hamsa Saberr, Hamza Elghonemy, Youssef Afify, Tamer Basha, Omair Shahzad Bhatti, Abdulrahman M. Selim, Hasan Md Tusfiqur Alam, Daniel Sonntag

Abstract

Longitudinal glioblastoma response assessment requires comparing subtle tumor changes across MRI time points using structured clinical criteria such as RANO. However, most deep learning methods predict response labels directly from imaging features, which limits clinical inspection, verification, and correction. We introduce TRACE, a RANO 2.0-aligned concept bottleneck model for interpretable 4-class glioblastoma response classification on longitudinal 3D MRI. TRACE processes paired baseline and follow-up multimodal MRI scans with a shared 3D vision encoder, predicts clinically meaningful tumor measurements as root concepts, computes downstream RANO-derived concepts through deterministic rules, and incorporates scan interval and new-lesion information as passthrough concepts. This design frames response assessment as structured concept reasoning rather than direct image-to-label prediction. Using 5-fold patient-wise cross-validation on the LUMIERE dataset, TRACE achieves a 4-class macro F1 of 0.4769 and a binary progression-versus-non-progression macro F1 of 0.7085. It improves over a concept bottleneck baseline and remains within the range of published non-interpretable deep learning approaches. Ablation studies show that the expert RANO graph and intervention-consistency training are important for performance, while intervention experiments demonstrate that correcting concepts can improve downstream predictions. These results suggest that structured concept bottlenecks offer a transparent and clinically aligned direction for longitudinal glioblastoma response assessment, while highlighting the need for larger protocol-aligned datasets and external validation.
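The concept-to-label step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give TRACE's rule graph, so the concept names (`baseline_spd_mm2`, `followup_spd_mm2`, `new_lesion`), the `Concepts` and `classify` helpers, and the thresholds are all assumptions. The thresholds follow the commonly cited RANO bidimensional cutoffs (at least a 25% increase for progression, at least a 50% decrease for partial response). The scan-interval concept and the 3D encoder that would predict the root concepts are omitted. The sketch shows root concepts feeding a derived concept through a deterministic rule, and how correcting one root concept changes the downstream label.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Concepts:
    """Hypothetical concept set; field names are illustrative, not from the paper."""
    baseline_spd_mm2: float  # root concept: sum of products of diameters at baseline
    followup_spd_mm2: float  # root concept: same measurement at follow-up
    new_lesion: bool         # passthrough concept, supplied rather than predicted


def percent_change(c: Concepts) -> float:
    """Derived concept: relative change in tumor burden, in percent."""
    if c.baseline_spd_mm2 <= 0:
        raise ValueError("baseline measurement must be positive")
    return 100.0 * (c.followup_spd_mm2 - c.baseline_spd_mm2) / c.baseline_spd_mm2


def classify(c: Concepts) -> str:
    """Deterministic 4-class rule (assumed thresholds, see lead-in)."""
    if c.new_lesion:
        return "PD"  # progressive disease
    if c.followup_spd_mm2 == 0:
        return "CR"  # complete response
    change = percent_change(c)
    if change >= 25.0:
        return "PD"
    if change <= -50.0:
        return "PR"  # partial response
    return "SD"      # stable disease


# Concepts as a model might predict them (made-up numbers).
predicted = Concepts(baseline_spd_mm2=400.0, followup_spd_mm2=460.0, new_lesion=False)
predicted_label = classify(predicted)

# Intervention: a clinician corrects the follow-up measurement,
# and the label is recomputed from the corrected concept.
corrected = replace(predicted, followup_spd_mm2=520.0)
corrected_label = classify(corrected)

print(predicted_label, "->", corrected_label)
```

Because the label is a pure function of the concepts, any correction to a concept propagates to the prediction without retraining, which is the property the intervention experiments in the abstract rely on.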
