Qwen Goes Brrr: Off-the-Shelf RAG for Ukrainian Multi-Domain Document Understanding

2026-05-11 · Computation and Language

Computation and Language · Artificial Intelligence · Information Retrieval · Machine Learning
AI summary

The authors took part in a competition where systems had to answer Ukrainian multiple-choice questions by finding supporting information inside PDF documents. They built a pipeline that first splits the PDFs into carefully structured chunks, then retrieves passages based on both the question and its answer options, and finally selects an answer from the best passages found. This approach improved how reliably the system identified the right passages and raised its answer accuracy. They found that preserving the documents' original layout and tailoring retrieval to both questions and answers worked better than adding complicated extra tricks downstream.

multi-domain document understanding · PDF chunking · dense retrieval · reranking · answer generation · Recall@1 · Qwen models · multiple-choice question answering · information retrieval
Authors
Anton Bazdyrev, Ivan Bashtovyi, Ivan Havlytskyi, Oleksandr Kharytonov, Artur Khodakovskyi
Abstract
We participated in the Fifth UNLP shared task on multi-domain document understanding, where systems must answer Ukrainian multiple-choice questions from PDF collections and localize the supporting document and page. We propose a retrieval-augmented pipeline built around three ideas: contextual chunking of PDFs, question-aware dense retrieval and reranking conditioned on both the question and answer options, and constrained answer generation from a small set of reranked passages. Our final system uses Qwen3-Embedding-8B for retrieval, a fine-tuned Qwen3-Reranker-8B for passage ranking, and Qwen3-32B for answer selection. On a held-out split, reranking improves Recall@1 from 0.6957 to 0.7935, while using the top-2 reranked passages raises answer accuracy from 0.9348 to 0.9674. Our best submission scored 0.9452 on the public leaderboard and 0.9598 on the private leaderboard. Our results suggest that, under strict code-competition constraints, preserving document structure and making relevance estimation aware of the answer space are more effective than adding complex downstream heuristics.
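The three-stage structure described in the abstract (retrieve → rerank conditioned on question and options → constrained answer selection) can be sketched as follows. This is a minimal illustration only: the toy token-overlap scorers stand in for Qwen3-Embedding-8B, Qwen3-Reranker-8B, and Qwen3-32B, and the function names and example data are assumptions, not the authors' implementation.

```python
# Sketch of a retrieve -> rerank -> constrained-answer pipeline.
# Token-overlap scoring is a placeholder for the dense embedding,
# reranker, and generator models used in the actual system.

def tokens(text):
    """Crude tokenizer: lowercase, whitespace-split."""
    return set(text.lower().split())

def retrieve(question, chunks, k=5):
    """Stage 1: retrieval, approximated here by question/chunk overlap."""
    scored = [(len(tokens(question) & tokens(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]

def rerank(question, options, candidates, top_n=2):
    """Stage 2: rerank conditioned on BOTH the question and the
    answer options, mirroring the answer-aware relevance idea."""
    query = question + " " + " ".join(options)
    scored = [(len(tokens(query) & tokens(c)), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_n]]

def answer(question, options, passages):
    """Stage 3: constrained selection -- the output must be one of
    the given options, chosen by support in the top passages."""
    context = tokens(" ".join(passages))
    return max(options, key=lambda o: len(tokens(o) & context))

# Hypothetical toy collection and question.
chunks = [
    "Kyiv is the capital of Ukraine.",
    "Ukraine borders the Black Sea to the south.",
    "The official currency of Ukraine is the hryvnia.",
]
q = "What is the capital of Ukraine?"
opts = ["Kyiv", "Lviv", "Odesa"]
top = rerank(q, opts, retrieve(q, chunks), top_n=2)
print(answer(q, opts, top))  # prints: Kyiv
```

The key design point the sketch preserves is that reranking sees the answer options, not just the question, and that the final answer is constrained to the option set rather than generated freely.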