Customization under Fire: Plugin Poisoning in Text-to-Image Ecosystem
2026-06-08 • Cryptography and Security
AI summary
The authors study a new security problem with LoRA plugins used in text-to-image AI models, which allow users to easily share and customize model features. They show that attackers can hide harmful behaviors inside these plugins to secretly influence images or create inappropriate content, which then spreads as people remix and share the plugins. This makes the collaborative sharing ecosystem risky, as malicious plugins are hard to detect and remain active across different models and uses. Their experiments demonstrate that these attacks are very effective and stealthy. The work highlights potential dangers in the model-sharing community that were previously overlooked.
text-to-image (T2I) models, Low-Rank Adaptation (LoRA), plugin poisoning, concept hijacking, task injection, model-sharing ecosystem, payload propagation, attack success rate (ASR), Civitai, Liblib
Authors
Jiahao Chen, Xing He, Yong Yang, Xinfeng Li, Chunyi Zhou, Junhao Li, Zhe Ma, Tianyu Du, Shouling Ji
Abstract
The prosperity of text-to-image (T2I) models has fostered a vibrant share-and-play ecosystem centered on Low-Rank Adaptation (LoRA) plugins, which allow users to customize and share model capabilities with ease. This democratization, however, comes with a hidden but severe security risk. Malicious users can distribute seemingly benign LoRA plugins that contain hidden functionality, poisoning model-sharing markets such as Civitai and Liblib, undermining the user trust that underpins this collaborative ecosystem, and threatening the safety of countless downstream applications. Despite these risks, plugin poisoning in the real-world T2I ecosystem remains underexplored. This paper introduces PoisonLoRA, the first systematic study of LoRA plugin supply-chain risks that exploits the trust and characteristics within the T2I ecosystem. We identify two primary attack instances: (1) Concept Hijacking, in which a hijacked LoRA generates images intended to influence public opinion and spread propaganda, and (2) Task Injection, in which a hidden task is injected into a LoRA so that it produces harmful content (e.g., NSFW images) only when activated by a secret key. Critically, the malicious payload persists and propagates in a virus-like manner: it weaponizes the very act of creative collaboration (e.g., LoRA merging) to spread, turning every remix into a new carrier. Extensive experiments validate that PoisonLoRA is both effective and stealthy. Specifically, we achieve an attack success rate (ASR) of approximately 100% on both Civitai and Liblib, on 6 datasets across 4 scenarios, without being detected by the platforms. The poisoned LoRA is also highly robust, retaining nearly 100% ASR even when transferred to different base models and remixed more than 5 times.
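The abstract reports results as an attack success rate (ASR) but does not define the metric. A minimal sketch, assuming ASR is simply the fraction of attack trials judged successful, is shown below. The function name `attack_success_rate` and the sample outcomes are hypothetical, and the paper's actual success criterion may differ.

```python
from typing import Sequence


def attack_success_rate(outcomes: Sequence[bool]) -> float:
    """Return the fraction of trials marked successful.

    Assumption: each entry is True when the generated image was judged to
    exhibit the attacker's intended behavior. How that judgment is made
    is not specified in the abstract.
    """
    if not outcomes:
        raise ValueError("need at least one trial")
    # True counts as 1 and False as 0, so sum() gives the success count.
    return sum(outcomes) / len(outcomes)


# Hypothetical per-trial judgments, for illustration only.
trials = [True, True, True, False]
print(f"ASR = {attack_success_rate(trials):.0%}")  # ASR = 75%
```

Under this reading, "approximately 100% ASR" means that nearly every trial met the success criterion.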