A Dual-Track Framework for Template-Constrained LaTeX Conversion
2026-06-22 • Computation and Language
AI summary
The authors address the difficulty of converting structured Markdown documents into LaTeX that follows a specific template. They note that previous methods either used strict rules, which struggled with complex elements, or relied fully on AI models, which sometimes made errors that are hard to fix. Their solution is a two-part system: one part extracts template rules, and the other uses AI only for the tricky parts and rules for the straightforward ones. Tests on various papers and templates showed that their approach preserves structure more accurately and compiles successfully more often than older methods.
Markdown, LaTeX, document conversion, template constraints, rule-based systems, Large Language Models, semantic metadata, bibliographic references, compilation success, dual-track framework
Authors
Chung Cheuk Hei, Liu Li
Abstract
With growing demand for advanced document conversion, mapping structured Markdown drafts into template-compliant formats such as LaTeX remains a challenge. Existing approaches depend largely on either deterministic rule-based converters or pure end-to-end Large Language Model (LLM) generation. The former fail to handle asset insertions and template-specific constraints correctly, while the latter tends to induce semantic drift, producing hallucinations that are difficult to debug. To address these limitations, we introduce a robust Dual-Track Framework that decouples template formatting from document processing: an offline track extracts template constraints into a reusable manifest, while an online track implements a hybrid execution pipeline. This pipeline confines LLM usage to reasoning-intensive components (e.g., semantic metadata, bibliographic references, and complex visual/tabular layouts) and delegates deterministic processing to rule-based engines. Empirical evaluation across 7 LaTeX templates and 56 published research papers demonstrates that our method preserves structural fidelity better, satisfies diverse layout constraints, and achieves a higher compilation success rate than prior baselines.
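The split between the offline and online tracks described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Manifest` fields, the function names (`extract_manifest`, `route`, `convert_rule`, `convert`), the routing heuristic, and the `llm` callable are all hypothetical, chosen only to show how a reusable manifest and a rule/LLM router might fit together.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Manifest:
    """Hypothetical reusable manifest produced by the offline track."""
    document_class: str
    required_commands: list[str] = field(default_factory=list)


def extract_manifest(template_tex: str) -> Manifest:
    """Offline track (sketch): read the class name and metadata commands from a template."""
    cls = re.search(r"\\documentclass(?:\[[^\]]*\])?\{([^}]+)\}", template_tex)
    # Assumed set of metadata commands worth recording; a real manifest would hold more.
    commands = [c for c in ("title", "author", "affiliation")
                if re.search(r"\\" + c + r"\b", template_tex)]
    return Manifest(cls.group(1) if cls else "article", commands)


def route(block: str) -> str:
    """Online track (sketch): send tables and images to the LLM, everything else to rules."""
    first = block.lstrip()
    if first.startswith("|") or first.startswith("!["):
        return "llm"
    return "rule"


def convert_rule(block: str) -> str:
    """Deterministic conversion: map Markdown headings to sectioning commands."""
    m = re.match(r"(#{1,3}) (.+)", block)
    if m:
        cmd = ["section", "subsection", "subsubsection"][len(m.group(1)) - 1]
        return "\\" + cmd + "{" + m.group(2).strip() + "}"
    return block  # plain paragraphs pass through unchanged


def convert(markdown: str, manifest: Manifest, llm) -> str:
    """Hybrid pipeline: `llm` is any callable (block, manifest) -> LaTeX string."""
    blocks = [b for b in markdown.split("\n\n") if b.strip()]
    body = [llm(b, manifest) if route(b) == "llm" else convert_rule(b) for b in blocks]
    return "\n\n".join(["\\documentclass{" + manifest.document_class + "}",
                        "\\begin{document}", *body, "\\end{document}"])
```

In this sketch the manifest is computed once per template and reused across documents, and the model is only invoked for blocks that the router flags, so a faulty model response is confined to a single, identifiable block.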