Causal Atlases from Entropic Inference: Bayesian Networks beyond Optimal DAGs

2026-06-04 | Machine Learning

AI summary

The authors focus on finding cause-and-effect connections in data using Bayesian networks, which are like maps showing how things might influence each other. Traditional methods pick one best map but can miss other possible maps because real data can support multiple cause-and-effect patterns. Their approach uses entropy, a concept from information theory, to create many possible maps that all fit the data well. This helps show where the data is unclear and avoids mistakes made by just choosing one "optimized" map. They tested this on simulated data and found their method better captures uncertainties in causal relationships.

Bayesian networks, causal relationships, directed acyclic graphs (DAGs), entropy, structural equation models, maximum-entropy ensemble, optimization, causal inference, causal ambiguity
Authors
Hazhir Aliahmadi, Irina Babayan, Greg van Anders
Abstract
Data-driven causal relationship identification is pertinent to advancing understanding of complex systems both within and beyond science. Bayesian networks offer a probabilistic method for modelling generic causal relationships via directed acyclic graphs (DAGs). However, typical techniques for constructing Bayesian networks rely on optimization, which can be ill-suited for learning causal relationships because the underlying data may admit multiple chains of causation. More data-faithful representations of causal relationships would provide frameworks for constructing multiple causal maps that are consistent with the variability that is inherent in underlying data. Here, we show that entropy-based inference generates atlases of plausible causal relationships that are consistent with underlying data. On simulated noisy data of 2- and 20-node linear structural equation models, we sample a maximum-entropy ensemble of graphs that allow us to quantify the inherent structural ambiguity in underlying causal relationships. Our method shows that "optimized" DAGs can contain causal artifacts that are not consistent across equivalently accurate topologies.
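
The abstract does not spell out the scoring function or the sampling scheme, so the following is only an illustrative sketch of the 2-node case, not the authors' implementation. It assumes a linear-Gaussian model scored by BIC and weights each candidate graph by a Boltzmann factor, p(G) ∝ exp(-beta * BIC(G)), which is the maximum-entropy distribution for a fixed average score. The function names, the coefficient 0.8, the sample size, and beta = 0.5 are all hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_loglik(residuals):
    """Maximized Gaussian log-likelihood of residuals, using the MLE variance."""
    n = residuals.size
    var = np.mean(residuals ** 2)
    return -0.5 * n * (np.log(2 * np.pi * var) + 1.0)


def residuals_given(child, parent=None):
    """Residuals of `child` after least-squares regression on `parent` (with intercept)."""
    if parent is None:
        return child - child.mean()
    slope, intercept = np.polyfit(parent, child, 1)
    return child - (slope * parent + intercept)


def bic_scores(x, y):
    """BIC (lower is better) for the three DAGs on two nodes."""
    n = x.size
    # (log-likelihood, number of free parameters): each node has a mean and a
    # variance; each edge adds one regression coefficient.
    candidates = {
        "empty": (gaussian_loglik(residuals_given(x))
                  + gaussian_loglik(residuals_given(y)), 4),
        "X->Y": (gaussian_loglik(residuals_given(x))
                 + gaussian_loglik(residuals_given(y, x)), 5),
        "Y->X": (gaussian_loglik(residuals_given(y))
                 + gaussian_loglik(residuals_given(x, y)), 5),
    }
    return {g: -2.0 * ll + k * np.log(n) for g, (ll, k) in candidates.items()}


def maxent_weights(scores, beta=0.5):
    """Boltzmann weights over graphs; `beta` is an assumed inverse temperature."""
    names = list(scores)
    s = np.array([scores[g] for g in names])
    w = np.exp(-beta * (s - s.min()))  # subtract the minimum for numerical stability
    return dict(zip(names, w / w.sum()))


# Simulated noisy data from a 2-node linear SEM with ground truth X -> Y.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.8 * x + rng.normal(size=500)

scores = bic_scores(x, y)
weights = maxent_weights(scores)
for graph, weight in weights.items():
    print(f"{graph}: BIC={scores[graph]:.2f}, weight={weight:.3f}")
```

Under these assumptions the two directed graphs fit the data equally well, so the ensemble splits its weight between X->Y and Y->X instead of committing to one. An optimizer that returns a single best-scoring DAG would have to break that tie arbitrarily, which is the kind of artifact the abstract attributes to "optimized" DAGs.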