Benchmarking Deep Learning for Future Liver Remnant Segmentation in Colorectal Liver Metastasis

2026-04-09
Machine Learning

AI summary

The authors refined a public dataset for segmenting the future liver remnant, the part of the liver that stays after surgery for colorectal cancer that has spread to the liver. They manually corrected all 197 CT volumes to create a validated benchmark for future research. They then compared several deep learning models (nnU-Net, SwinUNETR, and STU-Net) on segmenting the liver, the metastases, and the future liver remnant. A cascaded, step-by-step nnU-Net gave the best future liver remnant segmentation (Dice 0.767), while the pretrained STU-Net segmented metastases best (Dice 0.620) and was more robust to errors passed along the cascade. This work provides a foundation for developing AI tools that help doctors plan liver surgery.

liver segmentation, future liver remnant (FLR), colorectal liver metastases (CRLM), CT imaging, nnU-Net, SwinUNETR, STU-Net, medical image segmentation, Dice score, cascaded segmentation
Authors
Anthony T. Wu, Arghavan Rezvani, Kela Liu, Roozbeh Houshyar, Pooya Khosravi, Whitney Li, Xiaohui Xie
Abstract
Accurate segmentation of the future liver remnant (FLR) is critical for surgical planning in colorectal liver metastases (CRLM) to prevent fatal post-hepatectomy liver failure. However, this segmentation task is technically challenging due to complex resection boundaries, convoluted hepatic vasculature and diffuse metastatic lesions. A primary bottleneck in developing automated AI tools has been the lack of high-fidelity, validated data. We address this gap by manually refining all 197 volumes from the public CRLM-CT-Seg dataset, creating the first open-source, validated benchmark for this task. We then establish the first segmentation baselines, comparing cascaded (Liver->CRLM->FLR) and end-to-end (E2E) strategies using nnU-Net, SwinUNETR, and STU-Net. We find a cascaded nnU-Net achieves the best final FLR segmentation Dice (0.767), while the pretrained STU-Net provides superior CRLM segmentation (0.620 Dice) and is significantly more robust to cascaded errors. This work provides the first validated benchmark and a reproducible framework to accelerate research in AI-assisted surgical planning.
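
For readers unfamiliar with the metric, the Dice values quoted above measure the overlap between a predicted mask and a reference mask. The sketch below is illustrative only and is not the authors' evaluation code: it computes the standard Dice coefficient on flattened binary masks, and adds a hypothetical helper, `restrict_to_liver`, showing one way a cascade could limit a downstream prediction to an upstream liver mask. Whether the paper's cascade works exactly this way is an assumption; the abstract does not specify it.

```python
from typing import Iterable, List


def dice_score(pred: Iterable[int], target: Iterable[int]) -> float:
    """Dice = 2|P ∩ T| / (|P| + |T|) for two binary masks given as flat 0/1 sequences."""
    p = [bool(v) for v in pred]
    t = [bool(v) for v in target]
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(a and b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def restrict_to_liver(candidate: Iterable[int], liver_mask: Iterable[int]) -> List[int]:
    """Hypothetical cascade step: keep a downstream prediction only inside the liver mask."""
    return [int(bool(c) and bool(m)) for c, m in zip(candidate, liver_mask)]


if __name__ == "__main__":
    reference = [1, 0, 0, 0]
    prediction = [1, 1, 0, 0]
    print(f"Dice: {dice_score(prediction, reference):.3f}")
```

Because each stage of a cascade consumes the previous stage's output, an error in the liver mask propagates to every later stage, which is the sensitivity to cascaded errors that the abstract refers to.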