Brain-Prompt Injection: A Route-Safety Audit for BCI-LLM Agents

2026-06-08 • Cryptography and Security

Cryptography and Security, Artificial Intelligence

AI summary

The authors study how brain-computer interface (BCI) pipelines that drive tool-use agents can be tricked through subtle signal or context changes, an attack they call brain-prompt injection. They show that detecting decoder errors is not enough to guarantee safe actions; safety depends on what the audit log can actually observe. They propose a formal auditing framework, consisting of a contract, supporting theory, and a calibration method, to better catch risky routing changes. In tests on EEG-based left/right commands, they find that combining provenance logging, agreement checks, and confirmation can reduce risk, but no method guarantees safety against all attacks.

brain-computer interface, BCI, EEG, brain-prompt injection, audit log, route safety, conformal calibration, signal perturbation, decoder robustness, provenance

Authors

Jianwei Tai

Abstract

BCI-to-agent pipelines turn decoded neural activity into an authorization channel for tool-use agents, exposing a new attack surface we call brain-prompt injection: signal-side perturbations, context-only injections, and adaptive dual-decoder attacks can all change the routed action while EEG-side or text-side monitors remain blind. Route safety in this stack depends on what the audit log can observe, not on decoder accuracy or agreement alone. We define a Route-Safety Audit Contract (a minimal log schema, denominator hierarchy, and endpoint specification) and prove an audit-schema separation theorem together with a C3 attacked-dependence decomposition; clean agreement and marginal robustness do not identify the joint term that controls C3 routing. As a calibration layer on top of the contract, we apply split-conformal calibration to a non-oracle EEG confirmation channel and report the resulting false-accept frontier under an explicit threat-archetype matrix. We instantiate the contract on EEGMMI native left/right command-control over 5,400 events, with harmless tool stubs and seed/case denominators. Provenance blocks C2 routes ($0.000$); agreement-plus-provenance routes C3 flips ($1.000$); confirmation-plus-provenance routes them ($0.000$). The conformal frontier reaches a false-accept rate (FAR) of $0.000$ at clean utility $0.150$ for $\alpha=0.005$ and FAR $0.119$ at clean utility $0.452$ for $\alpha=0.10$ under acquisition isolation; an attacker-controllable confirmation channel breaks the bound, driving FAR to $\approx 1$. Subject-cluster bootstrap confirms these intervals on $60$ subjects; cross-architecture (TinyEEGNet, EEGNetV4) and capacity-sweep results show within-regime saturation. Mediation and confirmation reduce risk; they are not intent certificates.
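To make the split-conformal step concrete, the following is a minimal sketch and not the paper's procedure. It assumes the confirmation channel emits a scalar score where higher means stronger confirmation, and that a held-out calibration set of should-reject events (for example, attacked routes) is available. The function names `conformal_p_values` and `accept`, and the synthetic Gaussian scores, are hypothetical and serve only as illustration.

```python
import numpy as np


def conformal_p_values(null_cal_scores, test_scores):
    """Conformal p-values of test scores against should-reject calibration scores.

    p = (1 + #{calibration scores >= test score}) / (n + 1).
    """
    cal = np.sort(np.asarray(null_cal_scores, dtype=float))
    test = np.atleast_1d(np.asarray(test_scores, dtype=float))
    # searchsorted(side="left") counts calibration scores strictly below each
    # test score, so the remainder is the count of scores >= the test score.
    n_ge = cal.size - np.searchsorted(cal, test, side="left")
    return (1.0 + n_ge) / (cal.size + 1.0)


def accept(null_cal_scores, test_scores, alpha):
    """Accept a confirmation only if its conformal p-value is at most alpha."""
    return conformal_p_values(null_cal_scores, test_scores) <= alpha


# Synthetic scores (an assumption for illustration, not data from the paper).
rng = np.random.default_rng(0)
null_cal = rng.normal(0.0, 1.0, 2000)    # should-reject events, calibration split
null_test = rng.normal(0.0, 1.0, 2000)   # should-reject events, test split
clean_test = rng.normal(2.0, 1.0, 2000)  # genuine confirmations, test split

for alpha in (0.005, 0.10):
    far = accept(null_cal, null_test, alpha).mean()       # false-accept rate
    utility = accept(null_cal, clean_test, alpha).mean()  # clean utility
    print(f"alpha={alpha}: FAR={far:.3f}, clean utility={utility:.3f}")
```

Sweeping `alpha` traces a frontier between false-accept rate and clean utility. The bound on the false-accept rate is marginal and holds only if calibration and test should-reject scores are exchangeable, which is why an attacker who controls the confirmation channel can break it, as the abstract reports.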
