Multi-Level Distributional Entropy for Explainable Network Intrusion Detection

2026-06-29 • Cryptography and Security

Cryptography and Security • Artificial Intelligence • Machine Learning

AI summary

The authors introduce a way to detect network attacks using entropy (a measure of randomness) computed from summarized flow records rather than raw packet data. Their method, Multi-Level Distributional Entropy (MDE), derives three entropy measures directly from flow summaries and does not require training data. On four benchmark datasets (NSL-KDD, CICIDS-2017, CICIDS-2018, UNSW-NB15), entropy-only features match conventional features on weighted F1. The authors also show that a high aggregate score can hide poor operational performance: the detection rate for some attacks is far lower than the F1 score suggests, and it falls to zero on held-out attack families. Finally, their analysis of the model explanations found feature importances that were consistent and meaningful across different environments.
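
The gap between aggregate F1 and detection rate can be reproduced with simple arithmetic. The sketch below is illustrative only: the confusion counts are invented rather than taken from the paper, the helper names are hypothetical, and detection rate is assumed to mean recall on the attack class.

```python
def f1(tp, fp, fn):
    """F1 score for one class from its true/false positive and false negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def weighted_f1_and_dr(tp, fn, fp, tn):
    """Return (support-weighted F1, detection rate), with attack as the positive class."""
    n_attack = tp + fn
    n_benign = tn + fp
    total = n_attack + n_benign

    f1_attack = f1(tp, fp, fn)
    # Viewed from the benign class, the roles swap: TP=tn, FP=fn, FN=fp.
    f1_benign = f1(tn, fn, fp)

    weighted = (n_attack * f1_attack + n_benign * f1_benign) / total
    dr = tp / n_attack if n_attack else 0.0  # assumed definition: attack recall
    return weighted, dr


# Invented example: 9,900 benign flows, 100 attack flows, every attack missed.
weighted, dr = weighted_f1_and_dr(tp=0, fn=100, fp=0, tn=9900)
print(f"weighted F1 = {weighted:.3f}, detection rate = {dr:.3f}")
```

With these made-up counts the weighted F1 is about 0.985 while the detection rate is 0, because the benign majority dominates the support-weighted average.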

Network Intrusion Detection System (IDS) • Entropy • Flow Statistics • Differential Entropy • Jensen-Shannon Divergence • Shannon Entropy • TCP Flags • Machine Learning • Detection Rate • SHAP Explanations

Authors

Mohamed Aly Bouke, Md Shohel Sayeed, Swee-Huay Heng, Azizol Abdullah, Mohamed Othman

Abstract

Machine learning network intrusion detection systems (IDS) rely on aggregate flow statistics that discard distributional structure, while established entropy measures require raw packet sequences unavailable in pre-aggregated flow datasets. We propose Multi-Level Distributional Entropy (MDE), an analytical framework that derives interpretable entropy features directly from flow-level summary statistics at three levels: within-flow Gaussian differential entropy, cross-directional Jensen-Shannon divergence (JSD), and Transmission Control Protocol (TCP) flag-pattern Shannon entropy, without raw packet access or training data. Across four benchmarks (NSL-KDD, CICIDS-2017, CICIDS-2018, UNSW-NB15) under a leakage-free fold-local pipeline, entropy-only features achieve weighted F1 of 0.708-0.989, matching conventional features without degrading performance. Full operational metric reporting then exposes failure modes that aggregate F1 conceals. On CICIDS-2018, F1=0.74 hides a detection rate (DR) of 0.48, and on held-out attack families F1 exceeds 0.998 while DR falls to zero. Under temporal shift, a pseudo-live replay of 703K flows reveals a threshold-ranking divergence in which score ranking is preserved (AUC=0.87) but fixed thresholds collapse (DR=0.082) and recalibration offers no recovery. SHapley Additive exPlanations (SHAP) fold-stability analysis (Spearman rho=0.80-0.95) confirms that entropy attributions are reproducible and domain-coherent across heterogeneous environments.
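
The three entropy levels named in the abstract correspond to standard closed-form quantities. The following is a minimal sketch of those textbook formulas, not the authors' implementation: the abstract does not say which flow summary fields feed each level, so the function names, the variance floor, and the use of small count vectors as inputs are all assumptions made for illustration.

```python
import math


def gaussian_differential_entropy(std, floor=1e-6):
    """Differential entropy (nats) of a Gaussian: 0.5 * ln(2 * pi * e * sigma^2)."""
    # Assumption: floor the standard deviation so constant-valued flows avoid log(0).
    sigma = max(std, floor)
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)


def shannon_entropy(counts):
    """Shannon entropy (bits) of a non-negative count or probability vector."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def jensen_shannon_divergence(p_counts, q_counts):
    """Jensen-Shannon divergence (bits, in [0, 1]) between two count vectors."""
    p_total, q_total = sum(p_counts), sum(q_counts)
    if p_total == 0 or q_total == 0:
        return 0.0  # assumption: treat an empty direction as carrying no divergence
    p = [c / p_total for c in p_counts]
    q = [c / q_total for c in q_counts]
    m = [(a + b) / 2 for a, b in zip(p, q)]
    # JSD(P, Q) = H(M) - (H(P) + H(Q)) / 2, where M is the midpoint distribution.
    return shannon_entropy(m) - (shannon_entropy(p) + shannon_entropy(q)) / 2


# Hypothetical flow summary; every value below is invented for illustration.
pkt_len_std = 120.0              # standard deviation of packet length within the flow
fwd_hist = [40, 10, 5]           # forward-direction packet-size histogram
bwd_hist = [5, 10, 40]           # backward-direction packet-size histogram
flag_counts = [1, 30, 1, 0, 12]  # e.g. SYN, ACK, FIN, RST, PSH counts

print(gaussian_differential_entropy(pkt_len_std))
print(jensen_shannon_divergence(fwd_hist, bwd_hist))
print(shannon_entropy(flag_counts))
```

Each function needs only per-flow summary values (a standard deviation or a handful of counts), which is the property the abstract emphasizes: no raw packet sequences and no fitted parameters are involved.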
