PlanarBench: Evaluating LLM Spatial Reasoning via Planar Graph Drawing

2026-06-01

Computation and Language, Artificial Intelligence
AI summary

The authors created a benchmark called PlanarBench to check whether large language models (LLMs) can draw planar graphs (graphs that can be drawn without any edges crossing) as ASCII art, given only a list of connections between points. The task is hard to memorize because the order of the connections, their directions, and the labels on the points can all change without changing the graph itself. They tested 91 models on the 199 simplest connected planar graphs with 2 to 7 points, counting each distinct structure once. Their main finding is that the number of connections (edges) is the strongest predictor of how hard a graph is for models to draw, more so than the number of points. Previous LLM graph benchmarks, which used only the number of points as the difficulty axis, did not report this.

Planar graph, ASCII art, Large language models, Edge list, Spatial reasoning, Non-isomorphic graphs, Connected graphs, Graph drawing, Edge count, Node labels
Authors
Oleksandr Nikitin
Abstract
PlanarBench tests whether LLMs can draw planar graphs as ASCII art given only an edge list -- a spatial reasoning task that resists memorization because edge order, edge orientation, and node labels are all permutable. We evaluate 91 models on the 199 simplest non-isomorphic connected planar graphs (2 to 7 vertices). Edge count is the dominant difficulty predictor ($r = -0.85$) -- a finding not reported in prior LLM graph benchmarks, which use only node count as the difficulty axis.
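
To make the permutability claim concrete, here is a minimal sketch of how one undirected graph can be presented as many different edge lists. It is not taken from the PlanarBench code: the function names `permute_edge_list` and `canonical`, the seed parameter, and the example graph are all assumptions made for illustration.

```python
import random


def permute_edge_list(edges, seed=None):
    """Return (new_edges, mapping): the same undirected graph, presented differently.

    Hypothetical helper, not part of PlanarBench. It applies the three
    permutations named in the abstract: node labels, edge orientation, edge order.
    """
    rng = random.Random(seed)

    # 1. Relabel nodes with a random bijection onto the same label set.
    nodes = sorted({v for edge in edges for v in edge})
    shuffled = nodes[:]
    rng.shuffle(shuffled)
    mapping = dict(zip(nodes, shuffled))

    new_edges = []
    for u, v in edges:
        a, b = mapping[u], mapping[v]
        # 2. Flip the orientation of each edge with probability 1/2.
        if rng.random() < 0.5:
            a, b = b, a
        new_edges.append((a, b))

    # 3. Shuffle the order in which edges are listed.
    rng.shuffle(new_edges)
    return new_edges, mapping


def canonical(edges):
    """Order-independent and orientation-independent form of an edge list."""
    return sorted(tuple(sorted(edge)) for edge in edges)


# Example graph (an assumption): a 4-cycle plus one chord, which is planar.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
permuted, mapping = permute_edge_list(edges, seed=0)

# Undoing the relabeling recovers the original graph exactly.
inverse = {new: old for old, new in mapping.items()}
restored = [(inverse[a], inverse[b]) for a, b in permuted]
assert canonical(restored) == canonical(edges)
```

Because every permuted list describes the same graph, a model cannot rely on having seen one particular textual form of it; it has to recover the structure from the edges themselves.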