Keep the Proof State Live: Snapshotting for Efficient Tactic Search in Lean 4

2026-05-25Logic in Computer Science

Logic in Computer ScienceArtificial Intelligence
AI summary

The authors studied a slow part of automated theorem proving in Lean 4, where many proof attempts each redo a lot of work, causing big delays. They found that most time is wasted on reloading libraries and rechecking proof details for every try. To fix this, they created a way to save the state of a partially done proof and reuse it across different proof attempts instead of starting over each time. Their method made the process 5 to 50 times faster on tested problems. This improvement works well alongside other caching methods and will be shared as open source.

Automated theorem provingLean 4Proof stateElaborationImport loadingProof snapshottingParallel tactic searchDraft-Sketch-Prove (DSP)MathlibLean language server
Authors
Austin Shen, Yunong Shi
Abstract
Automated theorem proving systems built on Lean 4 increasingly rely on parallel tactic search over partially specified proofs, such as those generated by Draft-Sketch-Prove (DSP) pipelines. In current systems, each search branch reconstructs a proof state by re-running elaboration, leading to substantial per-branch overhead. In Lean 4 with Mathlib, this cost has two components: (1) import loading, which deserializes pre-compiled libraries (~60 s per branch); and (2) theorem-body elaboration, which re-checks the theorem context up to the target goal (estimated 18-735 s depending on proof complexity). Together, these account for >99% of per-branch wall time, making portfolio-based search impractical at scale. We observe that this overhead arises from a mismatch between the structure of proof search and its execution model: branching is implemented via repeated reconstruction of proof states rather than direct reuse. To address this, we introduce proof-state snapshotting, which captures the elaborated proof state once and reuses it across branches via a small extension to the Lean 4 language server. Across 48 miniF2F-v2 problems (45 prove-phase benchmarks and 3 full end-to-end runs), our approach achieves a 5.6-50x wall-time speedup over the standard fallback (average 14x, median 9.7x). Speedup increases with the number of proof branches. Our method is orthogonal to import-level caching (e.g., Kimina Lean Server), which avoids import loading but not theorem-body elaboration. The patched Lean binary and the Snapshot-DSP pipeline will be released as open source upon publication.