Semiparametric Efficient Test for Interpretable Distributional Treatment Effects
2026-05-08 • Machine Learning
AI summary
The authors discuss how treatment effects can change parts of the outcome distribution that average values miss, like extremes or variability. They introduce DR-ME, a new test that not only detects differences in these distributions but also shows where these differences occur. Using advanced statistical techniques, their method stays accurate in messy real-world data and highlights interpretable locations of change. Their experiments confirm the method works well, including in a medical imaging example.
distributional treatment effects · kernel tests · semiparametric efficiency · doubly robust estimation · causal inference · local testing · chi-square test · covariance whitening · sample splitting · medical imaging
Authors
Houssam Zenati, Arthur Gretton
Abstract
Distributional treatment effects can be invisible to means: a treatment may preserve average outcomes while changing tails, modes, dispersion, or rare-event probabilities. Kernel tests can detect discrepancies between interventional outcome laws, but global tests do not reveal where the laws differ. We propose DR-ME, to our knowledge the first semiparametrically efficient finite-location test for interpretable distributional treatment effects. DR-ME evaluates an interventional kernel witness at learned outcome locations, returning causal-discrepancy coordinates rather than only a global rejection. From observational data, we derive orthogonal doubly robust kernel features whose centered oracle form is the canonical gradient of this finite witness. For fixed locations, we characterize the local testing limit: DR-ME is chi-square calibrated under the null, has noncentral chi-square local power, and uses the covariance whitening that optimizes local signal-to-noise for discrepancies visible through the selected coordinates. This efficient local-power geometry yields a principled location-learning criterion, with sample splitting preserving post-selection validity. Experiments show near-nominal type-I error, competitive power against global doubly robust kernel tests, and interpretable learned locations that localize distributional effects in a semi-synthetic medical-imaging study.
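The abstract's testing geometry can be illustrated with a minimal sketch of a plain finite-location mean-embedding two-sample test: evaluate a kernel witness at a few locations, whiten with the empirical covariance, and compare the quadratic form to a chi-square quantile. This is only the underlying witness/whitening mechanic, not DR-ME itself (no doubly robust features, no observational-data adjustment); the function name, Gaussian kernel, and regularizer are illustrative assumptions.

```python
import numpy as np
from scipy.stats import chi2

def finite_location_test(X, Y, T, gamma=1.0, alpha=0.05):
    """Sketch of a finite-location kernel mean-embedding test.

    X, Y : (n, d) samples from the two outcome distributions.
    T    : (J, d) test locations (learned on a held-out split to
           preserve post-selection validity, as in the paper).
    Returns the whitened statistic and a reject decision.
    """
    def feats(A):
        # Gaussian-kernel evaluations k(a_i, t_j) at each test location.
        d2 = ((A[:, None, :] - T[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    Z = feats(X) - feats(Y)              # per-sample witness coordinates, (n, J)
    n, J = Z.shape
    w = Z.mean(axis=0)                   # empirical witness at the J locations
    S = np.cov(Z, rowvar=False) + 1e-8 * np.eye(J)  # covariance for whitening
    stat = n * w @ np.linalg.solve(S, w)            # whitened quadratic form
    # Under the null the statistic is asymptotically chi-square with J dof;
    # under local alternatives it is noncentral chi-square.
    return stat, bool(stat > chi2.ppf(1 - alpha, df=J))
```

The whitening step `solve(S, w)` is what optimizes local signal-to-noise for discrepancies visible through the chosen coordinates; the returned per-location witness values `w` are the interpretable discrepancy coordinates the abstract refers to.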