CollectionLoRA: Collecting 50 Effects in 1 LoRA via Multi-Teacher On-Policy Distillation

2026-05-25 • Computer Vision and Pattern Recognition

Computer Vision and Pattern RecognitionArtificial Intelligence

AI summaryⓘ

The authors address the problem of managing many customized image effects stored as separate LoRA modules for diffusion models, which increases deployment complexity and causes quality issues when combined. They introduce CollectionLoRA, a method that merges up to 50 different effect LoRAs into a single one while preserving fast generation and good quality. Their approach uses new training strategies to help the model switch between effects and keep them separate without interference. Tests show their method reduces deployment costs and maintains effect quality similar to or better than individual models.

Diffusion ModelsLow-Rank Adaptation (LoRA)Image EditingDistillationPrompt EngineeringParameter InterferenceMulti-Teacher LearningFew-Step GenerationConcept BleedingOrthogonal Prompting

Authors

Fangtai Wu, Hailong Guo, Shijie Huang, Jiayi Song, Yubo Huang, Mushui Liu, Zhao Wang, Yunlong Yu, Jiaming Liu, Ruihua Huang

Abstract

Customized image editing aims to equip pre-trained diffusion models with specific visual effects using limited paired data, typically via Low-Rank Adaptation (LoRA). As the number of desired effects grows, storing and dynamically loading numerous these effect LoRAs significantly increases deployment overhead. Furthermore, current pipelines typically cascade these effect LoRAs with acceleration modules for fast generation, which triggers severe parameter interference and results in concept bleeding and style degradation. We propose CollectionLoRA, a multi-teacher on-policy distillation framework capable of distilling the concepts of up to 50 different effect LoRAs along with few-step generation capabilities into a single LoRA. This fundamentally resolves the feature interference issue and significantly reduces deployment costs. Specifically, the method introduces (i) a Probabilistic Dual-Stream Routing mechanism that enables the model to randomly switch between data sources during training, effectively enhancing its generalization in unseen scenarios; (ii) an Asymmetric Orthogonal Prompting strategy to achieve concept isolation within the prompt space; (iii) a Coarse-to-Fine Distillation Objective to mitigate the distribution gap between the teacher and student models. Extensive evaluations show that CollectionLoRA distills all customized effects and few-step generation into a single LoRA, reducing deployment overhead while achieving concept fidelity comparable to or better than independently trained teacher models.

View PDFOpen arXiv