Beyond Loss Values: Robust Dynamic Pruning via Loss Trajectory Alignment
2026-04-08 • Computer Vision and Pattern Recognition • Machine Learning
AI summary
The authors found that current methods for removing less important data points during training struggle when labels contain mistakes, because these methods tend to keep mislabeled data, mistaking its high loss for importance. They created AlignPrune, a new tool that looks at how each sample's error changes over time to better spot and remove noisy data. This tool can be added to existing methods without changing the model or training process, and it improves accuracy by a noticeable margin across multiple benchmarks. Their work offers a reliable way to handle messy real-world data during model training.
dynamic data pruning · noisy labels · loss trajectory · Dynamic Alignment Score · model training · data pruning · noise robustness · plug-and-play module · machine learning benchmarks
Authors
Huaiyuan Qin, Muli Yang, Gabriel James Goenawan, Kai Wang, Zheng Wang, Peng Hu, Xi Peng, Hongyuan Zhu
Abstract
Existing dynamic data pruning methods often fail under noisy-label settings, as they typically rely on per-sample loss as the ranking criterion. This can mistakenly preserve noisy samples because of their high loss values, resulting in a significant performance drop. To address this, we propose AlignPrune, a noise-robust module designed to enhance the reliability of dynamic pruning under label noise. Specifically, AlignPrune introduces the Dynamic Alignment Score (DAS), a loss-trajectory-based criterion that enables more accurate identification of noisy samples, thereby improving pruning effectiveness. As a simple yet effective plug-and-play module, AlignPrune can be seamlessly integrated into state-of-the-art dynamic pruning frameworks, consistently outperforming them without modifying either the model architecture or the training pipeline. Extensive experiments on five widely used benchmarks across various noise types and pruning ratios demonstrate the effectiveness of AlignPrune, boosting accuracy by up to 6.3% over state-of-the-art baselines. Our results offer a generalizable solution for pruning under noisy data, encouraging further exploration of learning in real-world scenarios. Code is available at: https://github.com/leonqin430/AlignPrune.
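The abstract does not define the Dynamic Alignment Score, but the core idea of a loss-trajectory-based criterion can be illustrated with a minimal sketch. The sketch below is an assumption, not the authors' actual formula: it scores each sample by the cosine similarity between its (mean-centered) loss trajectory over training and a reference trajectory such as the dataset mean, so that samples whose losses fail to decrease like the bulk of the data receive low scores and are pruned first. The function name `dynamic_alignment_score` and the choice of cosine similarity are hypothetical.

```python
import numpy as np

def dynamic_alignment_score(loss_history, reference):
    """Hypothetical sketch of a loss-trajectory alignment score.

    loss_history: (num_samples, num_epochs) per-sample losses over training.
    reference:    (num_epochs,) a reference trajectory, e.g. the mean
                  trajectory over all samples.
    Returns one score per sample; higher means better aligned with the
    reference trajectory (more likely clean, under this assumption).
    """
    # Center each trajectory so the score reflects trajectory *shape*,
    # not absolute loss magnitude.
    traj = loss_history - loss_history.mean(axis=1, keepdims=True)
    ref = reference - reference.mean()
    # Cosine similarity between each centered trajectory and the reference.
    num = traj @ ref
    denom = np.linalg.norm(traj, axis=1) * np.linalg.norm(ref) + 1e-12
    return num / denom

# Toy usage: two samples with decreasing loss (clean-looking) and one
# whose loss stays high (noisy-looking, despite its large loss values).
losses = np.array([[2.0, 1.2, 0.6, 0.3],
                   [2.1, 2.0, 2.2, 2.1],
                   [1.8, 1.0, 0.5, 0.2]])
scores = dynamic_alignment_score(losses, losses.mean(axis=0))
keep = np.argsort(scores)[1:]  # prune the single lowest-scoring sample
```

Note how a plain per-sample-loss criterion would rank the second sample as most worth keeping (its loss is the highest), while the trajectory-based score instead flags it for pruning, which is the failure mode the abstract describes.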