On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters

2026-06-01 • Machine Learning

Machine Learning, Computation and Language

AI summary

The authors study adapters, small trainable add-ons attached to large pre-trained models. Rather than treating adapters only as a cheaper way to customize a model, they treat them as persistent local state that holds the preferences, skills, and memory-like updates of a specific user or task. They examine three directions: how a stronger base model makes small adapters more useful, how small an adapter can be while staying reliable, and how many adapted instances can coexist. They also present MinT, an infrastructure example for managing adapter identity, revisions, provenance, evaluation, and serving. Overall, they suggest that adapters could serve as compact personal models layered on a shared foundation model.

Parameter-efficient fine-tuning, Adapters, Foundation models, Model personalization, Model scaling, MinT infrastructure, Persistent local state, Fine-tuning, Shared priors, Model customization

Authors

Mind Lab: Song Cao, Vic Cao, Kaijie Chen, Bunny Fan, Hera Feng, Huan Feng, Arthur Fu, Jun Gao, Hongquan Gu, Aaron Guan, Mutian Hong, Hailee Hou, Peixuan Hua, Charles Huang, Miles Jiang, Nora Jiang, Yuyi Jiang, Autumn Jin, Fancy Kong, Kyrie Lei, Alexy Li, Dawn Li, Ray Li, Theo Li, Wenhao Li, Jiayi Lin, Domini Liu, Heshan Liu, Kairus Liu, Logan Liu, Maeve Luo, Runism Lv, Pony Ma, Verity Niu, Anson Qiu, Vincent Wang, Maxwell Yao, Regis Ye, Wenlin Ye, Yanying Ye, Josh Ying, Danney Zeng, Salmon Zhan, Anya Zhang, Ruijia Zhang, Shiyang Zhang, Sueky Zhang, Ya Zhang, Wei Zhao, Ada Zhou, Sizer Zhou, Xinyue Zhu, Murphy Zhuang

Abstract

Parameter-efficient fine-tuning (PEFT) is usually treated as a cheaper alternative to full fine-tuning. We study a broader role: small trainable adapters as persistent local state on top of strong shared foundation models. In this framing, the base model provides shared competence while adapters carry instance-specific behavior such as preferences, skills, tool habits, and memory-like updates. We organize the problem around three scaling axes: Scale Up, where stronger shared priors make small local updates more useful; Scale Down, where we study how small adapters can be while remaining reliable; and Scale Out, where many persistent adapted instances coexist. MinT provides one infrastructure example for managing adapter identity, revision, provenance, evaluation, and serving residency. Together, the results suggest that PEFT can be a compact substrate for persistent personal models rather than only a budget substitute for full fine-tuning.
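To make the "shared base plus persistent local state" framing concrete, here is a minimal sketch in plain Python. It assumes a low-rank additive adapter (in the spirit of LoRA), which the abstract does not specify; the names `Adapter`, `forward`, `owner`, and `revision` are hypothetical and are not MinT's API. The point is only that one frozen base matrix can serve many instances, each of which stores far fewer parameters than the base.

```python
from dataclasses import dataclass
from typing import List, Optional

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


@dataclass(frozen=True)
class Adapter:
    """Hypothetical per-instance state: identity, revision, low-rank weights."""
    owner: str
    revision: int
    down: Matrix  # shape (d_in, r), assumed low-rank factor
    up: Matrix    # shape (r, d_out), assumed low-rank factor

    def num_params(self) -> int:
        return sum(len(r) for r in self.down) + sum(len(r) for r in self.up)


def forward(x: Matrix, base: Matrix, adapter: Optional[Adapter] = None) -> Matrix:
    """Shared base output, plus the instance-specific low-rank update if given."""
    y = matmul(x, base)
    if adapter is not None:
        y = add(y, matmul(matmul(x, adapter.down), adapter.up))
    return y


# Shared 4x4 base (16 parameters), never modified by any instance.
BASE: Matrix = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# One rank-1 adapter (8 parameters) holding a single user's local change.
alice = Adapter(
    owner="alice",
    revision=1,
    down=[[1.0], [0.0], [0.0], [0.0]],
    up=[[0.0, 2.0, 0.0, 0.0]],
)

x: Matrix = [[1.0, 1.0, 1.0, 1.0]]
base_out = forward(x, BASE)          # behavior shared by every instance
alice_out = forward(x, BASE, alice)  # behavior specific to this instance
```

In this toy setting the base output is unchanged for every other instance, and adding another user means storing another small `Adapter` record rather than another copy of `BASE`. That is the Scale Out axis in miniature.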
