Conversations Risk Detection LLMs in Financial Agents via Multi-Stage Generative Rollout

2026-04-10 • Cryptography and Security

Cryptography and Security • Computational Engineering, Finance, and Science

AI summary

The authors propose FinSec, a new system designed to detect risky or unsafe language in financial agent conversations. Unlike older methods that check only one part of the conversation or use fixed rules, FinSec looks at how the meaning changes over several exchanges and uses multiple steps to spot suspicious or risky behavior. Their tests show FinSec works better at catching unsafe dialogue, reducing errors, and balancing safety with usefulness. Overall, FinSec offers a more reliable way to keep financial chatbots secure when handling sensitive information.

large language models, financial security, dialogue detection, semantic analysis, adversarial inference, risk assessment, F1 score, AUPRC, ASR, multi-turn dialogue

Authors

Xiaotong Jiang, Jun Wu

Abstract

With the rapid adoption of large language models (LLMs) in financial service scenarios, dialogue security detection under high regulatory risk presents significant challenges. Existing methods mainly rely on single-dimensional semantic judgments or fixed rules, making them inadequate for handling multi-turn semantic evolution and complex regulatory clauses; moreover, no existing models are specifically designed for financial security detection. To address these issues, this paper proposes FinSec, a four-tier security detection framework for financial agents. FinSec enables structured, interpretable, and end-to-end identification of actual financial risks, incorporating suspicious behavior pattern analysis, delayed risk and adversarial inference, semantic security analysis, and integrated risk-based decision-making. Notably, FinSec significantly enhances the robustness of high-risk dialogue detection while maintaining model utility. Experimental results demonstrate FinSec's leading performance. In terms of overall detection capability, FinSec achieves an F1 score of 90.13%, improving upon baseline models by 6 to 14 percentage points; its ASR is reduced to 9.09%, markedly lowering the probability of unsafe outputs; and the AUPRC increases to 0.9189, an approximately 9.7% gain over general frameworks. Additionally, in balancing utility and safety, FinSec obtains a composite score of 0.9098, delivering robust and efficient protection for financial agent dialogues.
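
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how an F1 score and an ASR figure are commonly computed. It is not taken from the paper: the `Counts` class, the function names, and the toy numbers are hypothetical, and the reading of ASR as the share of adversarial dialogues that end in an unsafe output is an assumption, since the abstract does not define it.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Hypothetical confusion counts for a binary risky/safe dialogue detector."""
    tp: int  # risky dialogues correctly flagged as risky
    fp: int  # safe dialogues wrongly flagged as risky
    fn: int  # risky dialogues the detector missed


def f1_score(c: Counts) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    flagged = c.tp + c.fp
    actual_risky = c.tp + c.fn
    precision = c.tp / flagged if flagged else 0.0
    recall = c.tp / actual_risky if actual_risky else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def attack_success_rate(unsafe_outputs: int, attack_dialogues: int) -> float:
    """Assumed definition: fraction of adversarial dialogues yielding unsafe output."""
    if attack_dialogues == 0:
        return 0.0
    return unsafe_outputs / attack_dialogues


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not results from the paper.
    toy = Counts(tp=8, fp=2, fn=2)
    print(f"F1  = {f1_score(toy):.4f}")
    print(f"ASR = {attack_success_rate(3, 20):.4f}")
```

Under these definitions a higher F1 and a lower ASR both indicate a better detector, which matches the direction of the improvements the abstract reports.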
