Different Statistical Perspectives for Understanding Generalisation in Graph Neural Networks

2026-05-25 • Machine Learning

AI summary

The authors review three main frameworks researchers use to study how well Graph Neural Networks (GNNs) generalise on graph data. The first uses learning theory: it bounds the generalisation error through the complexity of the class of functions a GNN can represent, and draws on what GNNs can and cannot distinguish. The second simplifies GNNs by letting the number of parameters or the size of the graph grow to infinity, which makes their behaviour easier to analyse. The third assumes the graph is drawn from a random graph model and derives error rates that hold for finite graphs. They also point out key results, limitations and open questions for each approach.

Graph Neural Networks, Statistical Generalisation, Uniform Convergence, Expressivity, Graph Isomorphism, Gaussian Processes, Neural Tangent Kernel, Stochastic Block Model, High-dimensional Statistics, Graphon

Authors

Nil Ayday, Mahalakshmi Sabanayagam, Debarghya Ghoshdastidar

Abstract

Graph Neural Networks (GNNs) are currently the most popular approach for learning and prediction on graph-structured data and are deployed in various fields, from social network analysis to drug discovery. However, there is limited mathematical understanding of the performance of GNNs. We discuss the various perspectives used to study statistical generalisation in GNNs. We identify three broad frameworks. The first framework, rooted in learning theory, relies on uniform convergence bounds and the complexity of the hypothesis class of specific GNN architectures. This framework also builds on the expressivity of GNNs, typically studied through the lens of graph isomorphism tests. The second framework simplifies the neural architecture by analysing GNNs under the asymptotics of infinitely many parameters or infinite graph size. It approximates GNNs using Gaussian processes, neural tangent kernels or graphon neural network operators, which allow studying the generalisation or stability of trained GNNs. The third framework studies GNNs under random graph models, often the contextual stochastic block model, and derives non-asymptotic error rates using tools from high-dimensional statistics. We highlight some key theoretical results and discuss a few limitations and open research questions for each perspective.
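
To make the third framework concrete, the following is a minimal sketch of sampling a two-community contextual stochastic block model and applying one mean-aggregation graph convolution. It is an illustration only, not the authors' setup: the function names, the parameter values (`n`, `p`, `q`, `mu`, `d`), the symmetric two-class design, and the fixed linear classifier along the direction `u` are all assumptions chosen for the example.

```python
import numpy as np

def sample_csbm(n, p, q, mu, d, rng):
    """Sample a two-community contextual stochastic block model (illustrative)."""
    # Community labels in {-1, +1}, drawn uniformly at random.
    y = rng.choice([-1, 1], size=n)
    # Edge probability: p within a community, q across communities.
    same = np.equal.outer(y, y)
    probs = np.where(same, p, q)
    # Sample the strict upper triangle and symmetrise; no self-loops.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    A = (upper | upper.T).astype(float)
    # Node features: class-dependent mean shift along u plus Gaussian noise.
    u = np.ones(d) / np.sqrt(d)  # hypothetical unit-norm mean direction
    X = mu * np.outer(y, u) + rng.standard_normal((n, d))
    return A, X, y, u

def graph_convolution(A, X):
    """Average each node's features over its neighbours and itself."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)  # degree including the self-loop
    return (A_hat / deg) @ X

def linear_accuracy(X, y, u):
    """Accuracy of the fixed linear classifier sign(<x, u>)."""
    return float(np.mean(np.sign(X @ u) == y))

rng = np.random.default_rng(0)
A, X, y, u = sample_csbm(n=400, p=0.10, q=0.02, mu=1.0, d=16, rng=rng)
acc_raw = linear_accuracy(X, y, u)
acc_conv = linear_accuracy(graph_convolution(A, X), y, u)
print(f"raw features: {acc_raw:.3f}, after one convolution: {acc_conv:.3f}")
```

With `p > q`, averaging over neighbours mostly pools nodes from the same community, which shrinks the feature noise while keeping most of the class signal. This is the kind of effect that analyses under random graph models aim to quantify.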
