Model Poisoning Against Federated Model Adaptation with Chain of Bit-Flips

2026-06-08, Cryptography and Security

Cryptography and Security, Artificial Intelligence
AI summary

The authors explore a new kind of backdoor attack in Federated Learning (FL), where models are trained across many devices without sharing raw data. Their attack uses hardware faults, like bit-flips caused by Rowhammer, to poison a single model during training and implant a backdoor that works across different FL rounds. They show that only a small number of induced faults can make the attack very effective, especially on common models like ResNet-18. The authors also consider how practical and robust this attack is, and discuss possible defenses against it.

Federated Learning, Backdoor Attack, Model Poisoning, Hardware Faults, Rowhammer, Bit-flips, ResNet-18, Neural Networks, Cybersecurity, Model Adaptation
Authors
Bastien Vuillod, Kevin Hector, Pierre-Alain Moellic, Jean-Max Dutertre, Olivier Potin
Abstract
Federated Learning (FL) allows a set of clients to collectively train a global model without sharing local training data. Giving responsibility for training to decentralized actors may lead to poisoning attacks: clients controlled by a malicious third party can poison the training dataset to install a backdoor in neural networks. In FL, these backdoor attacks rely solely on algorithmic approaches. However, recent advances in hardware-fault threats (e.g., Rowhammer) have widened the overall attack surface. In the context of federated model adaptation, we introduce a novel category of backdoor attack against FL systems that relies on model poisoning based on hardware-fault attacks. More precisely, we propose a task-agnostic backdoor attack that is implanted during FL training by inducing hardware faults (bit-flips) in the parameters of a single local model. The backdoor is crafted during a prior offline phase from the pretrained model initially used by the FL system. Our results show that a backdoor can be successfully applied to different types of models and datasets. Typically, up to 10 faults per malicious client occurrence and 19 total occurrences are enough to reach a 94% attack success rate on a ResNet-18. Finally, we discuss the practicality and robustness of the attack, as well as potential defenses, while putting into perspective the practical constraints of Rowhammer, which is the preferred attack vector for this type of threat.
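
To make the notion of a bit-flip in a model parameter concrete, the following minimal sketch toggles one bit of a stored weight and computes the upper bound on induced faults implied by the figures above. It is an illustration under stated assumptions, not the authors' method: the abstract does not specify the parameter format, so the sketch assumes signed 8-bit quantized weights in two's complement, and the helper `flip_bit_int8` and the example weight values are hypothetical.

```python
def flip_bit_int8(value: int, bit: int) -> int:
    """Flip one bit of a signed 8-bit integer stored in two's complement.

    Assumption: weights are int8-quantized; the source does not state this.
    """
    if not -128 <= value <= 127:
        raise ValueError("value must fit in a signed 8-bit integer")
    if not 0 <= bit <= 7:
        raise ValueError("bit index must be in 0..7")
    raw = value & 0xFF                        # view the weight as an unsigned byte
    raw ^= 1 << bit                           # the fault: toggle exactly one bit
    return raw - 256 if raw >= 128 else raw   # convert back to a signed value


# Hypothetical quantized weights of one layer (illustrative values only).
weights = [23, -5, 64, 1]

# A single fault on the most significant bit of the first weight.
faulted = list(weights)
faulted[0] = flip_bit_int8(faulted[0], bit=7)

# Upper bound on induced faults from the abstract's figures:
# up to 10 faults per malicious occurrence, 19 occurrences in total.
max_faults_per_occurrence = 10
malicious_occurrences = 19
max_total_faults = max_faults_per_occurrence * malicious_occurrences

print(weights, "->", faulted)
print("at most", max_total_faults, "bit-flips overall")
```

Under this assumed encoding, a flip of the most significant bit shifts a weight by 128 in magnitude, which is why a small number of well-chosen faults can alter a model's behavior far more than a comparable number of random low-order flips.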