DP-LAC: Lightweight Adaptive Clipping for Differentially Private Federated Fine-tuning of Language Models
2026-05-11 • Machine Learning
Machine Learning • Artificial Intelligence • Cryptography and Security • Distributed, Parallel, and Cluster Computing
AI summary
The authors study how to train big language models on many devices without sharing private data, a method called federated learning. They focus on making this process more private by controlling how much each device's data can influence the training (using a method called DP-SGD). They introduce DP-LAC, which smartly picks and adjusts a key setting (the clipping threshold) without extra privacy costs or complicated tuning. Their tests show DP-LAC improves accuracy compared to older methods while keeping data safer.
Federated Learning • Large-scale Language Models • Differential Privacy • Stochastic Gradient Descent • Clipping Threshold • Adaptive Clipping • Privacy Budget • Gradient • Private Histogram Estimation
Authors
Haaris Mehmood, Jie Xu, Karthikeyan Saravanan, Rogier Van Dalen, Mete Ozay
Abstract
Federated learning (FL) enables the collaborative training of large-scale language models (LLMs) across edge devices while keeping user data on-device. However, FL still exposes sensitive information through client-provided gradients. Differentially private stochastic gradient descent (DP-SGD) mitigates this risk by clipping each client's contribution to a threshold $C$ and adding noise proportional to $C$. Existing adaptive clipping techniques dynamically adjust $C$ but demand tedious hyperparameter tuning, which can erode the privacy budget. In this paper, we introduce DP-LAC, a method that first estimates an initial clipping threshold within an order of magnitude of the optimum using private histogram estimation, and then adapts this threshold during training without consuming additional privacy budget or introducing new hyperparameters. Empirical results show that DP-LAC outperforms both state-of-the-art adaptive clipping methods and vanilla DP-SGD, achieving an average accuracy gain of $6.6\%$.
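To make the clipping-and-noising step concrete, below is a minimal Python sketch of the standard DP-SGD-style aggregation the abstract refers to: each client's update is clipped to an L2 norm of at most $C$ and the averaged update is perturbed with Gaussian noise whose scale is proportional to $C$. This is not the authors' DP-LAC method (which additionally estimates and adapts $C$); the function names and the `noise_multiplier` parameter are illustrative assumptions.

```python
import numpy as np

def clip_update(update, C):
    """Scale a client's update so its L2 norm is at most the threshold C."""
    norm = max(np.linalg.norm(update), 1e-12)  # guard against zero norm
    return update * min(1.0, C / norm)

def aggregate_with_dp(client_updates, C, noise_multiplier, rng=None):
    """Average clipped client updates and add Gaussian noise scaled to C."""
    rng = rng or np.random.default_rng()
    clipped = [clip_update(u, C) for u in client_updates]
    mean = np.mean(clipped, axis=0)
    # Noise standard deviation is proportional to the clipping threshold C,
    # which is why the choice of C trades off bias (clipping) against noise.
    noise = rng.normal(0.0, noise_multiplier * C / len(client_updates),
                       size=mean.shape)
    return mean + noise
```

In this sketch, a smaller $C$ clips more aggressively but injects less noise, while a larger $C$ preserves updates at the cost of heavier noise; DP-LAC's contribution is choosing and adapting that threshold without spending extra privacy budget or adding hyperparameters.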