From Reconstruction to Decision: A Post-Encoder Plug-in Adapter for Curvilinear Segmentation

2026-06-22 · Computer Vision and Pattern Recognition

AI summary

The authors address the segmentation of thin, curvilinear objects such as vessels and cracks, where small local errors can break a structure apart. They identify two weaknesses in the stage after the encoder: reconstructing fine detail at high resolution once the encoder has heavily downsampled the input, and making the final binary decision. To address both, they propose PEPA, a lightweight add-on that improves detail recovery and threshold decisions without modifying the encoder. PEPA combines target-conditioned snake-like sampling during upsampling (TCSU) with target-adaptive differentiable thresholding (TADT), which improves the connectivity of the segmented shapes. Experiments on five medical and industrial benchmarks show that PEPA improves frozen-encoder baselines while adding only about 0.26M parameters.

Curvilinear segmentation, Topological continuity, Post-encoder, Upsampling, Differentiable thresholding, Binarization, Frozen encoder, clDice, IoU, Decoder
Authors
Qin Lei, Jiang Zhong, Xin Xiao, Yuming Yang, Hao Wu
Abstract
Curvilinear object segmentation, including vessels and cracks, is challenging due to extreme spatial sparsity and topological fragility, where small local errors can cause severe structural disconnections. Meanwhile, modern segmentation pipelines increasingly rely on strong but hard-to-modify foundation encoders whose heavy downsampling limits fine structural recovery. Motivated by this, we focus on the post-encoder stage and study two recurring and actionable failure modes: a reconstruction bottleneck in high-resolution feature restoration and a decision bottleneck in binarization. We present PEPA, a lightweight Post-Encoder Plug-in Adapter for 2D curvilinear segmentation pipelines with accessible decoder/head features and target, query, or class descriptors. PEPA couples (i) Target-Conditioned Snake Upsampling (TCSU), which uses target-conditioned continuous snake-like sampling to better recover thin and tortuous structures during upsampling, and (ii) Target-Adaptive Differentiable Thresholding (TADT), which predicts target-specific thresholds and optimizes a soft-threshold surrogate with explicit safeguards against trivial bias shifting. Under this post-encoder interface, PEPA can be attached to both prompt-based decoders and conventional dense predictors. Experiments on five medical and industrial benchmarks show that adding PEPA to frozen-encoder baselines yields consistent improvements, with gains in topological connectivity (clDice) typically exceeding those in region overlap (IoU), indicating improved structural continuity. With only $\sim$0.26M additional parameters, PEPA offers a practical post-encoder enhancement for structure-centric segmentation.
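To make the decision bottleneck concrete, the following is a minimal sketch of a soft-threshold surrogate. It is not the authors' TADT: the abstract does not specify the form of the surrogate or of its safeguards, so the sigmoid relaxation, the `temperature` parameter, and the function names `soft_threshold` and `hard_threshold` are all assumptions made for illustration. The sketch only shows why a smooth surrogate lets a per-target threshold be trained, and how the same probabilities binarize differently under different thresholds.

```python
import math


def soft_threshold(prob, threshold, temperature=0.05):
    """Smooth stand-in for the hard step (prob > threshold).

    Assumption: a sigmoid relaxation. Unlike the hard step, it has a
    nonzero derivative with respect to both prob and threshold.
    """
    return 1.0 / (1.0 + math.exp(-(prob - threshold) / temperature))


def hard_threshold(prob, threshold):
    """Non-differentiable binarization used at inference time."""
    return 1.0 if prob > threshold else 0.0


# Hypothetical foreground probabilities along a thin structure.
probs = [0.10, 0.45, 0.55, 0.90]

# Two hypothetical target-specific thresholds.
thresholds = (0.4, 0.6)

soft_masks = {t: [soft_threshold(p, t) for p in probs] for t in thresholds}
hard_masks = {t: [hard_threshold(p, t) for p in probs] for t in thresholds}

for t in thresholds:
    print(t, hard_masks[t], [round(s, 3) for s in soft_masks[t]])
```

With the lower threshold, the two mid-confidence pixels stay in the mask, while the higher threshold removes them. On a thin structure, that difference can decide whether a segment stays connected. A fixed global cutoff cannot make this trade-off per target, which is the motivation the abstract gives for predicting target-specific thresholds.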