SPARK: Security Knowledge Priming and Representation-Guided Knowledge Activation for LLM-based Secure Code Generation

2026-06-15 • Cryptography and Security

Cryptography and Security, Artificial Intelligence

AI summary

The authors found that large language models already know about code security problems but usually do not show this knowledge unless given a clear hint. They created SPARK, a tool that helps models reveal their existing security awareness during code generation without retraining. SPARK works by adding a short security-related cue to the prompt and adjusting model predictions to favor safer code. Their tests showed that SPARK performs as well as or better than other methods, and that it works across several popular programming languages and strong language models.

Large language models, Code generation, Security vulnerabilities, Pretraining, Common Weakness Enumeration (CWE), Inference-time intervention, Token bias, Hidden states, Fine-tuning, Prompt engineering

Authors

Xiaoyun Xu, Lichao Wu, Jona te Lintelo, Siyu Zhang, Stjepan Picek

Abstract

Large language models routinely generate code with exploitable security flaws. Prior literature attributes this limitation to a lack of security expertise, steering current defense mechanisms toward heavy fine-tuning or external knowledge retrieval, which introduces significant computational overhead and data bias through redundant code examples. Contrary to this view, we argue that pretraining corpora are already rich in security material. The bottleneck is activation: without an explicit and brief cue, statistical pressure toward common training-distribution patterns suppresses the model's safety-relevant representations. We present SPARK, an inference-time security harness that activates this latent knowledge without any retraining. The harness has two parts. Component I retrieves a few of the relevant Common Weakness Enumeration (CWE) entries for each coding task and appends a short structured cue to the prompt; this alone is enough to surface the model's existing security representations. Component II adds a precomputed token bias to the logits at every decoding step. We obtain the bias by projecting a safe-direction vector, the unit difference between the mean safe and mean unsafe last-layer hidden states, through the language model head. The bias is computed once offline; applying it costs a single vector addition per generated token. We evaluate SPARK on 9 open-source models across C++, Java, and Python, and compare with 7 baselines spanning fine-tuning and retrieval-augmented methods. SPARK matches or improves on the best baseline in every setting while preserving HumanEval utility. We further test Component I in a black-box setting on 7 of today's strongest models, including Claude, DeepSeek, and GPT, demonstrating the bottleneck of insecure code generation and the improvements enabled by our method.
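
The token bias described for Component II can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the `alpha` scaling factor, the `(vocab, hidden)` layout of the language model head, and the random toy data are all assumptions made for illustration. Any bias term in the head is ignored.

```python
import numpy as np

def safe_direction(safe_hidden, unsafe_hidden):
    """Unit difference between the mean safe and mean unsafe hidden states.

    Both inputs are assumed to be arrays of shape (num_examples, hidden_dim)
    holding last-layer hidden states.
    """
    diff = safe_hidden.mean(axis=0) - unsafe_hidden.mean(axis=0)
    return diff / np.linalg.norm(diff)

def token_bias(direction, lm_head, alpha=1.0):
    """Project the direction through the LM head to get one bias per token.

    lm_head is assumed to have shape (vocab_size, hidden_dim); alpha is a
    hypothetical strength parameter that the abstract does not mention.
    """
    return alpha * (lm_head @ direction)

def biased_logits(logits, bias):
    """Apply the precomputed bias: a single vector addition per step."""
    return logits + bias

# Toy data standing in for real model activations and weights.
rng = np.random.default_rng(0)
hidden_dim, vocab_size = 8, 5
safe_hidden = rng.normal(size=(16, hidden_dim)) + 0.5
unsafe_hidden = rng.normal(size=(16, hidden_dim)) - 0.5
lm_head = rng.normal(size=(vocab_size, hidden_dim))

# Offline: compute the direction and the bias once.
direction = safe_direction(safe_hidden, unsafe_hidden)
bias = token_bias(direction, lm_head, alpha=2.0)

# Online: add the same bias to the logits at each decoding step.
logits = rng.normal(size=vocab_size)
steered = biased_logits(logits, bias)
```

Because the bias depends only on the head weights and the fixed direction, it can be cached as one vector of vocabulary size, which is consistent with the per-token cost stated in the abstract.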
