Redefining Maritime Anomaly Detection via Equation-Grounded Synthetic Anomalies
2026-06-29 • Machine Learning
Machine Learning, Artificial Intelligence
AI summary
The authors focus on detecting unusual events in ship movements to improve safety at sea using data from the Automatic Identification System (AIS). Since existing AIS datasets often lack clear labels for anomalies, they define a taxonomy of three anomaly types: unexpected AIS activity, route deviation, and close approach between ships. They develop a pipeline that uses large language models (LLMs) to produce plausibility scores, synthesizes anomalies from those scores, and labels them at the timestamp level. They then test a range of detection models under different window lengths and anomaly-type mixes. Their work provides a structured approach to detecting and evaluating maritime anomalies.
Maritime anomaly detection, Automatic Identification System (AIS), Anomaly taxonomy, Route deviation, Close approach, Large language models (LLMs), Time-series models, Traffic management, Maritime safety
Authors
Youngseok Hwang, Sungho Bae, Dohun Lee, Jaeeun Seo, Jeehong Kim, Wonhee Lee, Hyunwoo Park
Abstract
Maritime anomaly detection is essential for ensuring maritime safety, security, and efficient traffic management at sea, with Automatic Identification System (AIS) data serving as a primary data source. Despite its importance, most publicly available AIS datasets lack predefined anomaly labels, forcing prior studies to rely on either distribution-based rarity or domain rule/expert-assisted labeling. These approaches, however, face fundamental limitations: statistical rarity often fails to reflect practically critical events, while expert-based labeling is costly, subjective, and difficult to scale. Moreover, both paradigms tend to overlook interaction-driven hazards such as near-miss approaches between vessels. To address these challenges, we propose an equation-grounded anomaly taxonomy that is implementable under a limited AIS observation schema and extensible to other AIS datasets. Specifically, the taxonomy defines three anomaly types: unexpected AIS activity (A1), route deviation (A2), and close approach (A3), covering both single-vessel and inter-vessel anomalies. Building on this taxonomy, we introduce a unified score-synthesize-label pipeline that produces LLM-guided plausibility scores, uses them to synthesize anomalies, and assigns timestamp-level labels. To rigorously assess detection performance, we further design benchmark evaluation settings that account for variations in temporal-window length and anomaly-type composition, and evaluate a broad range of time-series models and anomaly detection models. Together, these contributions provide a systematic basis for evaluating maritime anomaly detection methods across different anomaly types. Our code is available at https://github.com/snudial/open-maritime-anomaly-detection.
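To make the idea of an inter-vessel, timestamp-level label concrete, the following is a minimal illustrative sketch of a close-approach (A3) style check. It is not the authors' definition: the abstract does not state the equations, so the great-circle distance criterion, the function names `haversine_m` and `label_close_approach`, the track format, and the 500 m threshold are all assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def label_close_approach(track_a, track_b, threshold_m=500.0):
    """Return {timestamp: 0 or 1} for timestamps present in both tracks.

    Each track is a dict mapping timestamp -> (lat, lon) in degrees
    (a hypothetical format). A timestamp is labelled 1 when the two vessels
    are closer than threshold_m, which is an assumed, illustrative value.
    """
    labels = {}
    for t in sorted(track_a.keys() & track_b.keys()):
        distance = haversine_m(*track_a[t], *track_b[t])
        labels[t] = int(distance < threshold_m)
    return labels
```

Calling `label_close_approach` on two tracks yields one binary label per shared timestamp, which is the general shape of output that a timestamp-level labeling step would produce; the released code at the URL above should be consulted for the actual criteria.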