ARMOR: Adaptive Retriever Optimization for Low-Resource Telecom Question Answering

2026-06-29 | Information Retrieval

Information Retrieval, Artificial Intelligence, Computation and Language, Machine Learning
AI summary

The authors study how to improve question-answering systems in telecom, where evidence is scattered across many technical sources. Instead of fine-tuning the answer generator, they keep it fixed and adapt the retriever's query encoder, the component that finds relevant documents. They introduce ARMOR, a method that tunes the retriever using two goals: helping the generator and improving retrieval quality, while keeping the retriever close to its original state. Their approach improves retrieval and answer generation in several in-domain telecom settings.

retrieval-augmented generation, telecom QA, query encoder, latent-document likelihood, InfoNCE, contrastive learning, retriever fine-tuning, document retrieval, natural language generation, ARMOR
Authors
Heshan Fernando, Quan Xiao, Yan Xin, Tianyi Chen
Abstract
Telecom question answering (QA) is a challenging setting for retrieval-augmented generation (RAG): evidence is fragmented across standards, papers, encyclopedic resources, and web documents, and answers often hinge on technical tables, equations, and specialized protocol language. In low-resource subdomains, generator fine-tuning can over-specialize and degrade general capability, making query-side retriever adaptation an attractive alternative. To this end, we ask whether a fixed-generator, query-adapted RAG system can outperform generator-side adaptation, and which retriever objectives best support that setting. We motivate retrieval, rather than generator fine-tuning, as the adaptation target through a capacity comparison: under bounded-parameter and soft-retrieval assumptions, query-encoder tuning can have a smaller estimation term than supervised fine-tuning when its effective dimension is smaller. We identify two particularly relevant objectives -- the latent-document RAG likelihood, which optimizes generation utility, and the InfoNCE contrastive objective, which improves semantic retrieval geometry -- and leverage them jointly through a retriever optimization method targeting downstream QA performance in the telecom domain. Specifically, we introduce ARMOR, Adaptive Regularized Mixture Optimization for Retrievers, which learns separate temperatures for the RAG retrieval distribution and InfoNCE softmax and regularizes the adapted query encoder toward the frozen base query encoder. Across telecom-specific retrieval and generative QA benchmarks, we show that ARMOR improves evidence retrieval and answer generation in several in-domain settings. Code is available at https://github.com/heshandevaka/ARMOR.git.
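To make the combined objective concrete, the sketch below evaluates one possible form of it for a single query. It is a hedged illustration and not the authors' implementation: the function name `armor_style_loss`, the mixing weight `mix`, the penalty weight `reg`, the dot-product scoring, and the squared-distance penalty on the query embedding are all assumptions introduced here. The abstract states only that the method combines the latent-document RAG likelihood with InfoNCE, uses separate temperatures for the two softmaxes, and regularizes toward the frozen base query encoder. In ARMOR the temperatures are learned; here they are passed in as fixed numbers for simplicity.

```python
import math


def softmax(scores, temperature):
    """Numerically stable softmax of scores / temperature."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def armor_style_loss(q, q_base, docs, gen_likelihoods, positive_idx,
                     tau_rag, tau_nce, mix=0.5, reg=0.1):
    """Hypothetical single-query loss (all names are illustrative).

    q               -- embedding from the adapted query encoder
    q_base          -- embedding from the frozen base query encoder
    docs            -- candidate document embeddings
    gen_likelihoods -- p(answer | query, doc) from the fixed generator
    positive_idx    -- index of the gold document for InfoNCE
    tau_rag/tau_nce -- separate temperatures (learned in ARMOR, fixed here)
    """
    scores = [dot(q, d) for d in docs]

    # Latent-document RAG likelihood: marginalize the generator's
    # answer likelihood over the soft retrieval distribution.
    p_rag = softmax(scores, tau_rag)
    rag_nll = -math.log(sum(p * g for p, g in zip(p_rag, gen_likelihoods)))

    # InfoNCE: cross-entropy of the gold document against the candidates.
    p_nce = softmax(scores, tau_nce)
    info_nce = -math.log(p_nce[positive_idx])

    # Assumed proximity penalty: squared distance between the adapted
    # and frozen query embeddings.
    drift = sum((a - b) ** 2 for a, b in zip(q, q_base))

    return mix * rag_nll + (1.0 - mix) * info_nce + reg * drift


# Toy usage: two candidate documents, the second one is the gold passage.
loss = armor_style_loss(
    q=[1.0, 1.0], q_base=[1.0, 1.0],
    docs=[[1.0, 0.0], [0.0, 1.0]],
    gen_likelihoods=[0.2, 0.6], positive_idx=1,
    tau_rag=1.0, tau_nce=1.0,
)
```

In this toy setting both documents receive the same score, so each softmax is uniform and the drift term is zero. Keeping the two temperatures separate lets the retrieval distribution used for marginalization be sharper or flatter than the one used for the contrastive term, which is the degree of freedom the abstract attributes to ARMOR.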