How Far Can Prompting Go for Minimal-Edit Ukrainian Grammatical Error Correction?

2026-06-08 · Computation and Language

AI summary

The authors tested 11 commercial large language models (LLMs) and one open-source Ukrainian model to see how well they can fix grammar mistakes in Ukrainian text without task-specific fine-tuning. They found that minimal-edit prompts written in Ukrainian give every model its best results, with the largest gains on punctuation and case errors. Their best configuration (Gemini 3.1-Pro) reached F0.5 = 69.22, close to the 73.14 achieved by fine-tuned state-of-the-art models. They also identified five recurring patterns in which models overcorrect Ukrainian-specific phenomena. The authors made their code, prompts, and outputs publicly available.

Large Language Models, Grammatical Error Correction, Zero-shot Learning, Few-shot Learning, Minimal-edit Prompts, Ukrainian Language, Fine-tuning, Overcorrection, Prompt Engineering, F0.5 Score
Authors
Kateryna Karpo, Artem Chernodub
Abstract
Fine-tuned Large Language Models (LLMs) dominate Ukrainian grammatical error correction (GEC), while API-accessed LLMs remain nearly untested on minimal-edit benchmarks. We evaluate 11 commercial LLMs from four providers and one open-source Ukrainian model on the UNLP 2023 GEC-only benchmark, comparing zero-shot, few-shot, minimal-edits, and LLM-assisted prompt optimization strategies. Our best configuration (Gemini 3.1-Pro) reaches F0.5=69.22, closing over 90% of the gap to fine-tuned SOTA (F0.5=73.14). For zero-shot prompts, only Claude models benefit from Ukrainian instructions. However, every model achieves its best overall results with Ukrainian minimal-edits prompts, whose language-specific rules can be expressed precisely only in Ukrainian. LLM-assisted prompt optimization on top of minimal-edits + few-shot achieves the highest score. Detailed minimal-edits instructions yield the largest gains for punctuation and case errors but cause the model to abandon several low-frequency categories. In our error analysis, we identify five recurring overcorrection patterns tied to Ukrainian-specific linguistic phenomena. Code, prompts, and outputs are publicly available.
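
For readers unfamiliar with the F0.5 metric reported above, the following is a minimal sketch of the standard F-beta formula with beta = 0.5, which weights precision more heavily than recall and therefore penalizes overcorrection. The function name `f_beta` and the edit counts in the usage lines are illustrative assumptions; they are not taken from the paper, and the benchmark's own scorer (including how it extracts and aligns edits) is not reproduced here.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """Standard F-beta score from edit-level counts.

    tp: proposed edits that match a reference edit
    fp: proposed edits with no matching reference edit (overcorrections)
    fn: reference edits the system failed to propose
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts, for illustration only (not results from the paper).
score_f05 = f_beta(tp=60, fp=20, fn=40)            # precision 0.75, recall 0.60
score_f1 = f_beta(tp=60, fp=20, fn=40, beta=1.0)   # balanced F1 for comparison
print(f"F0.5 = {score_f05:.4f}, F1 = {score_f1:.4f}")
```

Because beta is below 1, a system with higher precision than recall scores better under F0.5 than under F1, which is why unnecessary edits are costly on minimal-edit benchmarks.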