Privacy Vulnerabilities of Attention Layers in Tabular Foundation Models and Protection of High-Risk Queries

2026-06-24

Cryptography and Security, Artificial Intelligence
AI summary

The authors show that tabular foundation models, which learn from labelled example records supplied as context at inference time, can unintentionally reveal whether a given record was among those context examples. They introduce AMIA, a membership inference attack that needs no shadow models and instead measures how concentrated the model’s internal attention is; it outperforms older confidence-based attacks. To reduce this privacy risk, they develop an inference-time defence that makes the context-key representations of high-risk queries less unique, without adding noise or retraining, at a small cost in accuracy. They also find that fine-tuning can increase privacy leakage: samples whose prediction confidence rises after fine-tuning become easier to identify as training data.

tabular foundation models, in-context learning, membership inference attacks, attention mechanism, shadow model, confidence-based attacks, k-anonymity, fine-tuning, memorisation, predictive utility
Authors
Tânia Carvalho, Maxime Cordy
Abstract
Tabular foundation models are commonly assumed to present limited privacy concerns because they are often pre-trained on large collections of synthetic data. However, these models rely on in-context learning, where sensitive records may be provided directly at inference time as labelled context examples. In this paper, we demonstrate that predictions generated via the attention mechanism leak sufficient information to enable effective Membership Inference Attacks (MIAs). To highlight this vulnerability, we propose AMIA (Attention-based Membership Inference Attack), a shadow-model-free attack that exploits the concentration of transformer attention patterns. Our results show that attention mechanisms reveal strong membership signals that classical confidence-based attacks do not capture: AMIA outperforms those attacks by an average of 7.7%, especially in low false-positive regimes. To mitigate this risk, we introduce an inference-time defence inspired by $k$-anonymity principles. This approach reduces the uniqueness of context-key representations without introducing random noise or retraining the model. By targeting only high-risk queries identified through AMIA scores, the defence reduces membership leakage by an average of 50% against AMIA and 25% against confidence-based attacks, while preserving predictive utility with only 3.9% performance degradation. Beyond showing that context examples are vulnerable, we further demonstrate that fine-tuning introduces an additional source of privacy risk. In particular, samples whose prediction confidence increases after fine-tuning become more susceptible to MIAs, indicating that fine-tuning can amplify memorisation and expose sensitive training information through confidence shifts.
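
The abstract does not specify how AMIA measures attention concentration, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a single attention head with scaled dot-product attention and uses one minus the normalised entropy of the query-to-context attention weights as a hypothetical membership score. The names `attention_concentration` and `tpr_at_fpr` are illustrative. The second function shows one common way to evaluate an attack in a low false-positive regime.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def attention_concentration(query, context_keys):
    """Hypothetical membership score in [0, 1] (an assumption, not AMIA itself).

    Computes scaled dot-product attention from one query vector to every
    context key, then returns 1 - H / log(n), where H is the entropy of the
    attention weights. Values near 1 mean attention is focused on very few
    context rows; values near 0 mean it is spread almost uniformly.
    """
    if len(context_keys) < 2:
        raise ValueError("need at least two context keys")
    scale = math.sqrt(len(query))
    logits = [sum(q * k for q, k in zip(query, key)) / scale
              for key in context_keys]
    weights = softmax(logits)
    entropy = -sum(w * math.log(w) for w in weights if w > 0.0)
    return 1.0 - entropy / math.log(len(weights))


def tpr_at_fpr(member_scores, non_member_scores, target_fpr):
    """True-positive rate when at most target_fpr of non-members are flagged.

    The threshold is taken from the ranked non-member scores, and a record
    is flagged as a member only if its score is strictly above it.
    """
    ranked = sorted(non_member_scores, reverse=True)
    allowed = int(target_fpr * len(ranked))
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    flagged = sum(1 for s in member_scores if s > threshold)
    return flagged / len(member_scores)
```

Under these assumptions, a query that closely matches one context key (for example `[4.0, 0.0]` against keys `[[4.0, 0.0], [0.0, 1.0], [1.0, 0.0]]`) receives a higher score than a query such as `[0.1, 0.1]` that attends to all keys almost equally. A defence that targets only high-risk queries, as the abstract describes, could in principle threshold such a score to decide which queries to protect.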