Food Noise & False Safety: A Systematic Evaluation of How LLMs Fail to Adapt to Eating Disorder Queries with Clinician Feedback

2026-06-01

Artificial Intelligence, Computation and Language
AI summary

The authors studied how people with eating disorders use chatbots powered by large language models (LLMs) to get advice and support. They found that these models, although not designed to give clinical advice, can respond unsafely when users make harmful or risky requests. By working with clinical experts, the authors identified specific language in user messages that makes unsafe chatbot replies more likely. They also systematically varied the level of risk in user prompts to measure how far the models uncritically adapt to dangerous inputs.

Eating disorders, Large language models, Chatbots, User interaction, Clinical advice, Safety risks, Linguistic cues, Self-harm, Prompt engineering, Emotional support
Authors
Giulia Pucci, Emily Hemendinger, Ruizhe Li, Gavin Abercrombie, Tanvi Dinkar, Arabella Sinclair
Abstract
Recent evidence shows that people with eating disorders (EDs) are increasingly seeking guidance, advice, and emotional support from Large Language Model (LLM)-based chat systems. Although these systems are not designed to provide clinical advice, their perceived expertise, neutrality, and accessibility make them a frequent, albeit risky, source of support. This paper investigates patterns of interaction between users with EDs and LLMs, focusing on the potential harms arising from models that uncritically adapt to and facilitate unsafe or self-harming user requests. In consultation with clinical ED experts, we find that specific linguistic cues in prompts increase the likelihood of unsafe responses. By systematically varying the degree of potential risk present in the user prompt, we also report the extent to which LLMs uncritically adapt to problematic and potentially dangerous user inputs.