TN-SHAP-G: Graph-Structured Tensor Network Surrogates for Shapley Values and Interactions

2026-06-01 · Machine Learning

Machine Learning, Artificial Intelligence
AI summary

The authors developed TN-SHAP-G, a method to quickly calculate Shapley values, which show how important each part of a graph-shaped input is to a model's prediction. Instead of checking all possible parts, which is very slow, their approach builds a smaller, simpler model that copies how the original model reacts to missing parts. This smaller model uses a special structure that matches the input graph, enabling precise calculation of importance scores without needing many extra checks. Tests on molecule data showed their method is accurate on small examples and works well on bigger graphs where other methods struggle.

Shapley values, graph-structured inputs, tensor networks, multilinear surrogate, machine learning interpretability, masking scheme, interaction indices, molecular benchmarks, model explanation
Authors
Farzaneh Heidari, Guillaume Rabusseau
Abstract
Shapley values are a widely used tool for attributing importance and interactions among input variables in black-box models, but their computation involves a function defined over an exponentially large space of subsets. We propose TN-SHAP-G, a framework that exploits structure in graph-structured inputs to compute Shapley values and higher-order interaction indices efficiently. Given a predictor and a fixed masking scheme, TN-SHAP-G learns a compact, graph-aligned multilinear surrogate that approximates the masked-input behavior, represented as a tensor network whose topology mirrors the input graph. Once trained from a small number of oracle queries, the surrogate enables deterministic recovery of first- and higher-order Shapley indices via the multilinear extension, without additional model queries or Monte Carlo variance. Experiments on molecular benchmarks show that the learned factorization closely matches exact Shapley values on small graphs and scales efficiently to larger graphs where sampling-based methods become infeasible.
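To make "deterministic recovery via the multilinear extension" concrete, the following is a minimal sketch and not the authors' implementation. It assumes the simplest graph, a path, so the tensor network reduces to a chain of matrix cores with randomly initialized (not trained) values. The names `chain_eval`, `shapley_multilinear`, and `shapley_bruteforce` are hypothetical. The sketch relies on a standard identity: the Shapley value of feature i equals the integral over t in [0, 1] of f(t, ..., 1 at position i, ..., t) minus f(t, ..., 0 at position i, ..., t), where f is the multilinear extension. Because the integrand is a polynomial of degree at most n - 1 in t, Gauss-Legendre quadrature with ceil(n / 2) nodes integrates it exactly, with no sampling.

```python
import itertools
import math

import numpy as np


def chain_eval(cores, left, right, x):
    """Evaluate the multilinear extension of a chain surrogate at x in [0, 1]^n.

    cores[i][0] / cores[i][1] are the matrices for feature i masked / present.
    """
    v = left
    for core, xi in zip(cores, x):
        v = v @ ((1.0 - xi) * core[0] + xi * core[1])
    return float(v @ right)


def shapley_multilinear(cores, left, right):
    """Shapley values via exact quadrature of the multilinear extension."""
    n = len(cores)
    # m nodes are exact up to degree 2m - 1; the integrand has degree <= n - 1.
    nodes, weights = np.polynomial.legendre.leggauss(max(1, math.ceil(n / 2)))
    ts, ws = (nodes + 1.0) / 2.0, weights / 2.0  # map [-1, 1] to [0, 1]
    phi = np.zeros(n)
    for i in range(n):
        for t, w in zip(ts, ws):
            x = np.full(n, t)
            x[i] = 1.0
            present = chain_eval(cores, left, right, x)
            x[i] = 0.0
            masked = chain_eval(cores, left, right, x)
            phi[i] += w * (present - masked)
    return phi


def shapley_bruteforce(cores, left, right):
    """Reference: the classical subset-sum definition (exponential in n)."""
    n = len(cores)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            coeff = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for subset in itertools.combinations(others, k):
                x = np.zeros(n)
                for j in subset:
                    x[j] = 1.0
                masked = chain_eval(cores, left, right, x)
                x[i] = 1.0
                present = chain_eval(cores, left, right, x)
                phi[i] += coeff * (present - masked)
    return phi


# Toy setup (assumed): 5 features on a path graph, bond dimension 3, random cores.
rng = np.random.default_rng(0)
n_features, rank = 5, 3
cores = rng.normal(size=(n_features, 2, rank, rank))
left, right = rng.normal(size=rank), rng.normal(size=rank)

phi_ml = shapley_multilinear(cores, left, right)
phi_bf = shapley_bruteforce(cores, left, right)
```

In this sketch the quadrature route costs on the order of n * ceil(n / 2) chain evaluations, whereas the brute-force reference enumerates all subsets. The two agree up to floating-point error, and the values sum to f(all present) minus f(all masked), as the efficiency property requires.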