Estimating Mutual Information between Time Series and Temporal Event Sequences Across Diverse Analysis Tasks

2026-06-01 · Machine Learning

Machine Learning, Artificial Intelligence, Information Theory
AI summary

The authors present a method for measuring how strongly two kinds of temporal data depend on each other: continuous signals (time series) and sequences of discrete events. The method works on the raw data directly, without transforming it, training a model, or discretizing it by hand, which avoids the bias and instability that quantization, repeated values, and redundant events cause in older estimators. Across four tasks on synthetic and real-world data, they report that it is more accurate and robust than existing techniques. In short, it is a general-purpose tool for quantifying relationships in mixed temporal data.

mutual information, time series, discrete event sequences, nonparametric estimation, causality analysis, temporal data mining, data quantization, latent event clustering, dependence measures, feature selection
Authors
Haoji Hu, Huaqing Mao, Yijun Lin, Xiaowei Jia, Jinwei Zhou, Minoh Jeong, Yao-Yi Chiang
Abstract
Pairwise dependence measures such as correlation and causality are fundamental to temporal data mining, yet there is still no principled and robust way to quantify dependence between heterogeneous data types, especially between continuous time series and discrete temporal event sequences. Existing approaches rely on ad hoc transformations or mutual-information estimators that are highly sensitive to quantization, repeated values, and event redundancy, leading to biased or unstable results in practice. We propose a nonparametric mutual information estimator that directly measures the dependence between time series and event sequences without data transformation, learning, or ad hoc discretization. Our method models the continuous-discrete duality of real-world time series to handle quantization and repeated-value artifacts and introduces a latent event clustering strategy to mitigate bias from event co-occurrence and redundancy. Together, these yield a robust and unified framework that bridges discrete and continuous mutual information. We evaluate the proposed estimator on four representative tasks: discrete-continuous time-delayed mutual information for causality analysis, global and local temporal repetition discovery, discrete covariate selection for time series forecasting, and continuous feature selection for classification. Experiments on synthetic and real-world datasets show consistent improvements over existing methods in accuracy, robustness, and interpretability, positioning our approach as a general-purpose dependence operator for heterogeneous temporal data, similar to Pearson correlation for homogeneous time series. Code available at: https://github.com/HaojiHu/Multimodal-Temporal-Data-Quantification
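To make the first evaluation task concrete, the sketch below computes discrete-continuous time-delayed mutual information with a naive binned plug-in estimator. This is not the proposed estimator: it relies on exactly the kind of ad hoc discretization the abstract argues against, and is shown only as a baseline illustration of what the quantity measures. The function names (`plugin_mi`, `delayed_mi`), the equal-width binning, the bin count, and the synthetic data are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def plugin_mi(labels, values, bins=8):
    """Plug-in mutual information (in nats) between discrete labels and
    continuous values, after equal-width binning of the values.
    Baseline only: the result depends on the chosen number of bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal
    binned = [min(int((v - lo) / width), bins - 1) for v in values]

    n = len(labels)
    joint = Counter(zip(labels, binned))
    label_counts = Counter(labels)
    bin_counts = Counter(binned)

    mi = 0.0
    for (a, b), c in joint.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (label_counts[a] * bin_counts[b]))
    return mi


def delayed_mi(events, series, lag, bins=8):
    """Mutual information between events[t] and series[t + lag], lag >= 0."""
    return plugin_mi(events[:len(events) - lag], series[lag:], bins)


# Synthetic example (assumed): an event at time t raises the series at t + 3.
random.seed(0)
n, true_lag = 2000, 3
events = [1 if random.random() < 0.2 else 0 for _ in range(n)]
series = [random.gauss(0.0, 1.0) for _ in range(n)]
for t in range(n - true_lag):
    if events[t]:
        series[t + true_lag] += 3.0

# Scan candidate lags and keep the one with the highest estimated dependence.
scores = {lag: delayed_mi(events, series, lag) for lag in range(7)}
best_lag = max(scores, key=scores.get)
print("estimated lag:", best_lag)
```

Scanning the lag and taking the maximum is the usual way a delayed dependence score is used to suggest a direction and delay of influence. Changing `bins` shifts the scores of this baseline, which is the quantization sensitivity the abstract describes.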