Closed-Form Node Classification with Exact Graph Unlearning
2026-05-25 • Machine Learning
Machine Learning
AI summaryⓘ
The authors explore whether certain graph neural network tasks can be done using direct mathematical formulas instead of slow trial-and-error training. They design special closed-form solutions that work well for different types of graphs and match or beat popular neural network methods on many benchmarks. Their approach allows quick updates when parts of the graph change, without needing to retrain from scratch, leading to large speedups. They also study how this method impacts privacy when removing data from the model, showing exact and approximate ways to 'unlearn' graph information.
Graph Neural NetworksNode ClassificationClosed-form SolverHomophilyRidge RegressionGraph Convolutional Network (GCN)Feature PropagationGraph UnlearningAssortative and Heterophilous GraphsLocalized Updates
Authors
Aditya Gaur, Charu Sharma
Abstract
Graph neural networks for node classification are typically trained by gradient descent over hundreds or thousands of epochs. Recent work has shown that, when properly tuned, classic GCN/SAGE/GAT architectures can match graph transformers on many node-classification benchmarks. We ask a complementary question: how much of this performance can be recovered by deterministic closed-form solvers, and what guarantees does this enable? We introduce a routed closed-form framework selected by adjusted homophily. For assortative graphs, we use SGC-style propagation followed by Ridge regression; for heterophilous graphs, we introduce LCF-Net, a layer-wise closed-form graph feature-refinement network whose per-layer Ridge solves are capped by a Gaussian kernel-Ridge head. Across 14 benchmarks, including ogbn-arxiv and ogbn-proteins, our closed-form predictors match or beat the best vanilla 2-layer GCN/SAGE/GAT on 9 of 9 measured datasets, tie tuned deep recipes within one standard deviation on 9 of 12 small benchmarks, and exceed the OGB-leaderboard plain GCN on both large graphs. The remaining heterophilous gap closely tracks the gain from vanilla 2-layer to deep SAGE, suggesting that the residual difference is primarily architectural. Because our predictors are explicit solutions of deterministic linear systems, modified graph inputs can be re-solved to obtain retrain-equivalent parameters. We formalize exact graph-object unlearning for label, feature, edge, node, and subgraph modifications, prove K-hop locality for Ridge components, and verify exactness across 109 configurations. On ogbn-arxiv, localized updates give $21$--$45\times$ speedups over full re-solving and roughly $10^{6}\times$ speedups over gradient retraining. Structural-inversion experiments further quantify the privacy floor of exact retraining and the additional leakage of approximate graph-unlearning methods.