E4GEN: Event-level Explainable Extreme-Enhanced Time-series Generation

2026-06-01 | Machine Learning

Machine Learning, Artificial Intelligence
AI summary

The authors developed E4GEN, a diffusion-based method for generating realistic time series that better represents rare extreme events such as sudden spikes or drops. The method has three parts: one learns when, during denoising, to activate the extreme-event control signal without disturbing regular patterns such as trend and seasonality; another predicts what control signal to apply, even when extreme-event labels are unavailable for training; and a third controls how that signal is injected into the generation process. Evaluated on six datasets, E4GEN outperformed current methods in overall fidelity, extreme-event fidelity, and downstream utility.

time series generation, extreme events, diffusion models, denoising process, control signals, semantic prediction, data-conditioned training, sampling mechanisms, trend and seasonality, downstream utility
Authors
Lin Jiang, Dahai Yu, Ximiao Li, Guang Wang
Abstract
Generating realistic time series is essential for scientific research and real-world applications. However, existing methods often emphasize overall distributional fidelity while failing to faithfully capture extreme events. To address this gap, we propose E4GEN, an explainable diffusion framework for extreme-event-aware time-series generation. E4GEN provides systematic insights into when, what, and how to control extreme-event generation through three key components. First, E-Activator learns a dataset-adaptive activation step for the extreme-control signal during the denoising process, without interfering with regular temporal components such as trend and seasonality. Second, E-Predictor determines what control signal to enforce through Self-Driven Semantic Prediction, where each sample derives its own control signal by inferring latent extreme-event information during generation. It also includes a novel Data-Conditioned Training, Noise-Initiated Sampling mechanism to address the issue of unavailable training labels. Third, E-Control specifies how to control extreme-event generation through a trainable Extreme Control Network, which transforms the semantic control signal into layer-wise signals and injects them into the denoising process. We evaluate E4GEN on six datasets with 17 metrics, and extensive experiments show that E4GEN outperforms state-of-the-art models across multiple dimensions, including overall fidelity, extreme-event fidelity, and downstream utility.
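As a rough illustration of the "when" idea only, the toy sketch below gates a control signal on a denoising step index. It is not the authors' implementation: the names `toy_denoiser`, `toy_control`, and `gated_reverse_process` are hypothetical, the activation step is a fixed argument rather than a learned quantity, and the choice to apply control at the later (lower-index) steps is an assumption, since the abstract does not say which part of the denoising trajectory is controlled.

```python
import random


def toy_denoiser(x, t):
    """Placeholder for a trained denoiser: shrinks every point toward zero."""
    return [0.9 * v for v in x]


def toy_control(x, signal, strength=0.5):
    """Placeholder control branch: moves each point partway toward the signal."""
    return [v + strength * (s - v) for v, s in zip(x, signal)]


def gated_reverse_process(length, num_steps, activation_step, signal, seed=0):
    """Run a toy reverse process from noise, applying control only when
    the step index t is at or below activation_step (an assumed rule)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(length)]  # start from Gaussian noise
    controlled_steps = []
    for t in range(num_steps, 0, -1):  # t = num_steps, ..., 1
        x = toy_denoiser(x, t)
        if t <= activation_step:
            x = toy_control(x, signal)
            controlled_steps.append(t)
    return x, controlled_steps


# Hypothetical control signal asking for a spike at position 2.
signal = [0.0, 0.0, 5.0, 0.0]
x, steps = gated_reverse_process(length=4, num_steps=10, activation_step=3, signal=signal)
_, no_control_steps = gated_reverse_process(length=4, num_steps=10, activation_step=0, signal=signal)
```

Here the first seven steps run without control and only the last three are steered toward the spike, so the uncontrolled steps are free to shape the sample before the extreme-event signal takes effect. Setting `activation_step=0` disables control entirely.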