Instruction set for the representation of graphs

2026-03-11Computation and Language

Computation and LanguageArtificial IntelligenceData Structures and Algorithms
AI summary

The authors created IsalGraph, a way to turn any simple graph into a short string using just nine instructions. Their method uses a virtual machine that builds or moves through the graph step-by-step, and importantly, every possible string corresponds to a valid graph. They also developed algorithms to convert graphs into these strings efficiently and to find a unique string representation. When tested on real datasets, they found that differences between these strings match well with differences between the original graphs, making this approach useful for tasks like comparing graphs or generating them with language models.

graph encodingsimple graphvirtual machinecircular doubly-linked listgraph isomorphismgraph edit distancelexicographic ordergraph traversalstring encodinggraph similarity
Authors
Ezequiel Lopez-Rubio, Mario Pascual-Gonzalez
Abstract
We present IsalGraph, a method for representing the structure of any finite, simple graph as a compact string over a nine-character instruction alphabet. The encoding is executed by a small virtual machine comprising a sparse graph, a circular doubly-linked list (CDLL) of graph-node references, and two traversal pointers. Instructions either move a pointer through the CDLL or insert a node or edge into the graph. A key design property is that every string over the alphabet decodes to a valid graph, with no invalid states reachable. A greedy \emph{GraphToString} algorithm encodes any connected graph into a string in time polynomial in the number of nodes; an exhaustive-backtracking variant produces a canonical string by selecting the lexicographically smallest shortest string across all starting nodes and all valid traversal orders. We evaluate the representation on five real-world graph benchmark datasets (IAM Letter LOW/MED/HIGH, LINUX, and AIDS) and show that the Levenshtein distance between IsalGraph strings correlates strongly with graph edit distance (GED). Together, these properties make IsalGraph strings a compact, isomorphism-invariant, and language-model-compatible sequential encoding of graph structure, with direct applications in graph similarity search, graph generation, and graph-conditioned language modelling