When Prompts Become Payloads: A Framework for Mitigating SQL Injection Attacks in Large Language Model-Driven Applications
2026-05-11 • Cryptography and Security • Artificial Intelligence
AI summary
The authors focus on making it safer to use large language models (LLMs) that turn natural language into database queries such as SQL. They explain that while these models make it easier to ask questions, they also open the door to a familiar class of attacks, SQL injection, now delivered through carefully crafted prompts. To address this, the authors build a security system with multiple layers that checks and cleans user prompts, detects unusual or harmful behavior, and blocks known attack patterns. They test their system against challenging attack examples and show it keeps database queries safe without raising too many false alarms.
Large Language Models • SQL Injection • Natural Language Interface • Prompt Sanitization • Threat Detection • Behavioral Anomaly • Semantic Anomaly • Adversarial Prompts • Database Security • Query Languages
Authors
Farzad Nourmohammadzadeh Motlagh, Mehrdad Hajizadeh, Mehryar Majd, Pejman Najafi, Feng Cheng, Christoph Meinel
Abstract
Natural language interfaces to structured databases are becoming increasingly common, largely due to advances in large language models (LLMs) that enable users to query data using conversational input rather than formal query languages such as SQL. While this paradigm significantly improves usability and accessibility, it introduces new security risks, particularly the amplification of SQL injection vulnerabilities through the prompt-to-SQL translation process. Malicious users can exploit these mechanisms by crafting adversarial prompts that manipulate model behavior and generate unsafe queries. In this work, we propose a multi-layered security framework designed to detect and mitigate LLM-mediated SQL injection attacks. The framework integrates a front-end security shield for prompt sanitization, an advanced threat detection model for behavioral and semantic anomaly identification, and a signature-based control layer for known attack patterns. We evaluate the proposed framework under diverse and realistic attack scenarios, including prompt injection, obfuscated SQL payloads, and context-manipulation attacks. To ensure robustness, we generate and curate a comprehensive benchmark dataset of adversarial prompts and assess performance on a fine-tuned LLM configuration. Experimental results demonstrate that the proposed approach achieves high detection accuracy while maintaining low false-positive rates, significantly strengthening the secure deployment of LLM-powered database applications.
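To make the layered architecture concrete, the sketch below shows one plausible way the three stages described in the abstract (prompt sanitization, anomaly detection, signature matching) could be chained into a single screening pipeline. All function names, regex signatures, and the threshold are illustrative assumptions; in particular, `anomaly_score` is a crude keyword-density proxy standing in for the paper's fine-tuned detection model, not the authors' implementation.

```python
import re

# Hypothetical sketch of a three-layer prompt screen; names, patterns,
# and thresholds are assumptions for illustration, not the paper's code.

# Layer 3 stand-in: signature-based control -- regexes for well-known
# SQL injection fragments.
SQLI_SIGNATURES = [
    re.compile(r"(?i)\bunion\s+select\b"),
    re.compile(r"(?i)\bor\s+1\s*=\s*1\b"),
    re.compile(r"(?i);\s*drop\s+table\b"),
    re.compile(r"(?i)\bsleep\s*\(\s*\d+\s*\)"),
    re.compile(r"--\s*$"),
]


def sanitize_prompt(prompt: str) -> str:
    """Layer 1 stand-in: front-end shield that normalizes the prompt."""
    cleaned = prompt.strip()
    # Collapse whitespace tricks sometimes used to obfuscate payloads.
    cleaned = re.sub(r"\s+", " ", cleaned)
    # Drop inline SQL comment markers that can hide trailing payloads.
    return cleaned.replace("/*", "").replace("*/", "")


def anomaly_score(prompt: str) -> float:
    """Layer 2 stand-in: a real system would query a fine-tuned detection
    model; here we approximate with the density of SQL keywords."""
    keywords = {"select", "insert", "update", "delete", "drop", "union", "exec"}
    tokens = prompt.lower().split()
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t.strip("();,'\"") in keywords)
    return hits / len(tokens)


def screen_prompt(prompt: str, threshold: float = 0.3) -> tuple[bool, str]:
    """Run the three layers in order; return (allowed, reason)."""
    cleaned = sanitize_prompt(prompt)
    for sig in SQLI_SIGNATURES:
        if sig.search(cleaned):
            return False, f"signature match: {sig.pattern}"
    if anomaly_score(cleaned) > threshold:
        return False, "anomaly score above threshold"
    return True, "passed all layers"


if __name__ == "__main__":
    print(screen_prompt("Show me all customers from Berlin"))
    print(screen_prompt("List users where name = '' OR 1=1 --"))
```

In this sketch the cheap signature check runs before the anomaly model so that known payloads are rejected without invoking the heavier layer, which mirrors the ordering one would expect from the framework's "signature-based control layer for known attack patterns."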