Scaling Novel Graph Generation via Lightweight Structure-Guided Autoregressive Models

2026-06-02 • Machine Learning

Machine Learning, Artificial Intelligence

AI summary

The authors present a method for generating realistic and diverse graphs more efficiently than existing approaches. The method turns each graph into a sequence of edges using a structure-guided ordering, which lets the model generate graphs in near log-linear time rather than the quadratic or worse cost of many earlier models. A two-phase training process helps the model avoid copying training graphs and instead produce new, valid ones. Experiments on molecular and non-molecular benchmarks show that the approach improves novelty while keeping validity and uniqueness high.

graph generation, autoregressive models, topological ordering, diffusion models, sequence modeling, LSTM, graph novelty, machine learning scalability, data augmentation, causal sequence models

Authors

Alessio Barboni, Massimiliano Lupo Pasini, Bishal Lakha, Edoardo Serra

Abstract

Generating realistic and diverse graphs is a key problem in machine learning, with applications in molecular discovery, circuit design, cybersecurity, and beyond. However, current graph generative models remain limited by scalability and novelty. Diffusion-based methods often require costly full-adjacency operations and long denoising chains, while many autoregressive and hybrid models have at least quadratic complexity. In addition, these models often imitate training graphs rather than generalize beyond them. We propose a lightweight autoregressive framework to address these issues. It uses a structure-guided topological ordering to serialize graphs into regular edge sequences, enabling near log-linear generation, and a two-phase training strategy that combines exploration-oriented augmentation with iterative refinement to reduce overfitting and promote controlled novelty. Experiments on molecular and non-molecular benchmarks show that our approach improves novelty while preserving high validity and uniqueness. The framework also supports both LSTM and Mamba-style causal sequence backbones, with large-memory accelerators enabling longer graph-sequence experiments beyond typical GPU limits.
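
To make the idea of serializing a graph into an edge sequence concrete, here is a minimal sketch. The abstract does not specify the structure-guided topological ordering, so this example assumes a plain breadth-first ordering as a stand-in; the function names (bfs_order, serialize_edges, deserialize_edges) are hypothetical and are not taken from the paper. It also assumes a connected, undirected graph given as an adjacency dictionary.

```python
from collections import deque


def bfs_order(adj, start):
    """Return nodes in breadth-first order from `start`, ties broken by node id.

    Assumption: BFS stands in for the paper's structure-guided ordering,
    which the abstract does not describe.
    """
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):
            if v not in seen:
                seen.add(v)
                order.append(v)
                queue.append(v)
    return order


def serialize_edges(adj, start):
    """Flatten a connected undirected graph into a list of (earlier, later) edges.

    Nodes are relabeled by their position in the ordering, and each edge is
    emitted once, when its later endpoint is visited.
    """
    order = bfs_order(adj, start)
    rank = {node: i for i, node in enumerate(order)}
    seq = []
    for v in order:
        for u in sorted(adj[v], key=rank.get):
            if rank[u] < rank[v]:
                seq.append((rank[u], rank[v]))
    return seq


def deserialize_edges(seq):
    """Rebuild an adjacency dictionary (over relabeled nodes) from an edge sequence."""
    adj = {}
    for u, v in seq:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


# Hypothetical example: a 4-cycle 0-1-3-2-0.
example = {0: {1, 2}, 1: {0, 3}, 2: {0, 3}, 3: {1, 2}}
seq = serialize_edges(example, start=0)
rebuilt = deserialize_edges(seq)
```

A causal sequence model such as an LSTM could then be trained to predict the next pair in seq given the preceding pairs. How the paper tokenizes edges and reaches near log-linear cost is not stated in the abstract, so this sketch makes no claim about matching it.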
