A Dual-Track Framework for Template-Constrained LaTeX Conversion

2026-06-22 • Computation and Language


AI summary

The authors address the difficulty of converting structured Markdown documents into LaTeX that follows a specific template. They note that previous methods either used strict rules, which struggled with complex elements, or relied fully on AI models, which sometimes made errors that are hard to fix. Their solution is a two-part system: one part extracts template rules, and the other uses AI only for the tricky parts and rules for the straightforward ones. Tests on 56 papers and 7 templates showed that their approach preserves document structure more accurately and achieves a higher compilation success rate than earlier methods.

Markdown, LaTeX, document conversion, template constraints, rule-based systems, Large Language Models, semantic metadata, bibliographic references, compilation success, dual-track framework

Authors

Chung Cheuk Hei, Liu Li

Abstract

With the increasing demand for advanced document conversion, mapping structured Markdown drafts into template-compliant formats such as LaTeX remains a challenge. Existing approaches largely depend on either deterministic rule-based converters or pure end-to-end Large Language Model (LLM) generation. The former fail to handle asset insertions and template-specific constraints correctly, while the latter tends to induce semantic drift, leading to hallucinations that are difficult to debug. To address these limitations, we introduce a robust Dual-Track Framework that systematically decouples template formatting from document processing: an offline track extracts template constraints into a reusable manifest, while an online track implements a hybrid execution pipeline. This pipeline confines LLM usage exclusively to reasoning-intensive components (e.g., semantic metadata, bibliographic references, and complex visual/tabular layouts) while delegating deterministic processing to rule-based engines. Empirical evaluation across 7 LaTeX templates and 56 published research papers demonstrates that our method preserves structural fidelity better, satisfies diverse layout constraints, and achieves a higher compilation success rate than previous baselines.
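
To make the hybrid routing idea concrete, here is a minimal Python sketch of how an online track might dispatch document blocks. It is an illustration under stated assumptions, not the authors' implementation: the names `TemplateManifest`, `Block`, `LLM_KINDS`, `render_rule_based`, and `convert` are hypothetical, the manifest fields are guesses at what an offline track could extract, and the LLM is passed in as a plain callable so that no particular model API is implied.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class TemplateManifest:
    """Hypothetical manifest an offline track might extract from a template."""
    document_class: str
    # Assumed mapping from Markdown heading level to LaTeX sectioning command.
    heading_commands: dict = field(
        default_factory=lambda: {1: "section", 2: "subsection", 3: "subsubsection"}
    )


@dataclass
class Block:
    kind: str       # e.g. "heading", "paragraph", "table" (assumed labels)
    text: str
    level: int = 0  # heading depth; unused for other kinds


# Block kinds treated as reasoning-intensive, following the abstract's examples.
LLM_KINDS = {"metadata", "bibliography", "figure", "table"}

# Escape a few LaTeX special characters; a real converter would cover more.
_ESCAPES = str.maketrans(
    {"&": r"\&", "%": r"\%", "#": r"\#", "_": r"\_", "$": r"\$"}
)


def escape_latex(text: str) -> str:
    return text.translate(_ESCAPES)


def render_rule_based(block: Block, manifest: TemplateManifest) -> str:
    """Deterministic path: headings use the manifest, other text is escaped."""
    if block.kind == "heading":
        command = manifest.heading_commands[block.level]
        return f"\\{command}{{{escape_latex(block.text)}}}"
    return escape_latex(block.text)


def convert(
    blocks: list,
    manifest: TemplateManifest,
    llm: Callable[[Block, TemplateManifest], str],
) -> list:
    """Route each block to the LLM callable or to the rule-based renderer."""
    rendered = []
    for block in blocks:
        if block.kind in LLM_KINDS:
            rendered.append(llm(block, manifest))
        else:
            rendered.append(render_rule_based(block, manifest))
    return rendered
```

Injecting the LLM as a callable keeps the deterministic path testable on its own and makes it easy to log exactly which blocks were sent to the model, which is the property the abstract associates with easier debugging.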
