Clinically Grounded Privacy Evaluation of Medical LMs
2026-06-08 • Computation and Language
Computation and Language • Cryptography and Security
AI summary
The authors studied how medical language models might accidentally reveal private patient information. They created a testing method that considers graded levels of what an attacker could know, from publicly inferable demographics to leaked fragments of medical notes. Testing a model pretrained on 378k clinical notes, they found that routine encounter details (name, date of birth, provider, practice, visit date) can lead the model to repeat patient-specific text verbatim and to reveal sensitive diagnoses such as HIV. However, the authors also noted that exact-match repetition can overstate disclosure, because 36% of memorized tokens come from templated documentation rather than unique patient data. Their work offers a realistic way to check what information a medical LM might leak.
medical language models, protected health information, privacy evaluation, verbatim memorization, semantic leakage, clinical notes, adversarial access, sensitive diagnoses, templated documentation, longitudinal clinical data
Authors
Sasha Ronaghi, Sana Tonekaboni, Lena Stempfle, Vivian Utti, Jordan Li Cahoon, Nathaniel Hendrix, Ayin Vala, Marzyeh Ghassemi, Emily Alsentzer
Abstract
Medical language models (LMs) can memorize and reproduce protected health information, but privacy evaluations often focus on recovery of training text rather than disclosure under realistic threat models. We introduce a clinically grounded framework that evaluates leakage along a graded axis of adversarial access, ranging from publicly inferable demographics to leaked note fragments. At each tier, we measure verbatim memorization of patient-specific text and semantic leakage of sensitive diagnoses. Applying the framework to an LM pretrained on 378k clinical notes, we find that routine encounter metadata (i.e. name, date of birth, provider, practice, visit date) elicits high rates of verbatim memorization across a patient's timeline and sensitive-diagnosis recovery (AUROC 0.91 for abortion, 0.81 for HIV). At the same time, exact-match memorization can overstate disclosure: 36% of memorized tokens reflect templated documentation. Our work highlights the risks of training on longitudinal clinical data, providing a practical framework for contextual privacy evaluation of medical LMs.
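The two kinds of measurement named in the abstract, verbatim memorization and AUROC for sensitive-diagnosis recovery, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `verbatim_prefix_len` and `auroc`, the whitespace tokenization, the prefix-match definition of memorization, and the toy inputs are all assumptions made for illustration, and the abstract does not specify how either quantity is computed.

```python
from typing import Sequence


def verbatim_prefix_len(generated: Sequence[str], reference: Sequence[str]) -> int:
    """Count leading tokens of `generated` that exactly match `reference`.

    Assumption: memorization is scored as an exact-match prefix; the paper
    may use a different matching rule.
    """
    matched = 0
    for gen_tok, ref_tok in zip(generated, reference):
        if gen_tok != ref_tok:
            break
        matched += 1
    return matched


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the probability that a positive example outscores a negative one.

    Ties between a positive and a negative count as 0.5.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy, made-up inputs (not data from the paper).
generated = "patient seen for follow up of hypertension".split()
reference = "patient seen for follow up of diabetes".split()
print(verbatim_prefix_len(generated, reference))  # 6

# Hypothetical model scores for "patient has the diagnosis" and true labels.
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
print(auroc(scores, labels))  # 0.75
```

The pairwise AUROC above is quadratic in the number of examples, which keeps the sketch dependency-free; a rank-based computation would be the usual choice at scale.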