From Fragments to Paths: Task-Level Context Recovery for Large Industrial Codebases

2026-06-22 • Software Engineering

Software Engineering • Artificial Intelligence

AI summary

The authors introduce DeepDiscovery, a new method to help large language models better understand big software projects by finding important parts of the code and the wider context around them. Unlike previous approaches that only looked at small code snippets, DeepDiscovery works in two steps to capture the bigger picture across complex code relationships, even with limited computing resources. They tested it on both real company projects and benchmarks, showing it finds more relevant files and improves software engineering tasks. Overall, the authors demonstrate that better understanding of entire code repositories helps AI coding tools perform complex tasks more effectively.

large language models, software engineering, code repositories, task-level understanding, file recovery, multi-relational repository structure, coding agents, SWE-bench Verified, recall rate, integrated codebase

Authors

Jiawei He, Weisong Sun, Mengyu Shi, Jie Jia, Tong Bian, Xikai Yang, Dong Sun

Abstract

Large language models have shown strong performance on software engineering (SE) tasks, yet understanding large industrial repositories remains challenging. Existing methods often retrieve only local fragments and fail to recover the broader task-relevant context needed for complex repository-level tasks. We present DeepDiscovery, a task-level repository-understanding method for large industrial codebases. DeepDiscovery uses a two-stage Location–Inference framework to localize high-confidence task anchors and recover broader task-relevant context over multi-relational repository structure under budget constraints. Across controlled method-level evaluation, organization-internal industrial repository-understanding scenarios, and end-to-end evaluation on SWE-bench Verified, DeepDiscovery consistently improves task-relevant file recovery and downstream SE performance. On 27 medium-scale tasks, DeepDiscovery achieves the best file recovery quality among five representative baselines without offline preprocessing. On organization-internal industrial tasks from a production-scale integrated codebase ecosystem, including 27 medium-scale tasks and 40 large-scale tasks, DeepDiscovery improves Full Recall Rate across multiple AI coding systems, with absolute gains ranging from 1.6 to 9.2 percentage points on large subprojects and from 2.5 to 7.4 percentage points on medium-scale subprojects. In a controlled end-to-end evaluation on SWE-bench Verified, a system equipped with DeepDiscovery achieves a 78.6% Solve Rate, outperforming the corresponding baseline by 8.2 percentage points. These results suggest that stronger task-level repository understanding can improve coding-agent performance on complex SE tasks.
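To make the two-stage idea concrete, the following is a minimal sketch of what budget-constrained context recovery from task anchors could look like. It is not the authors' implementation: the abstract does not specify the algorithm, so the relation types, the weights in RELATION_WEIGHTS, the multiplicative scoring rule, and the function name recover_context are all assumptions made for illustration. The sketch takes anchor files with confidence scores (standing in for the Location stage), then expands best-first over typed repository edges until a file budget is exhausted (standing in for the Inference stage).

```python
import heapq
from collections import defaultdict

# Hypothetical relation weights; the abstract does not name relation types
# or say how they are weighted.
RELATION_WEIGHTS = {"imports": 0.9, "calls": 0.8, "same_dir": 0.5}


def recover_context(anchors, edges, budget):
    """Best-first expansion from anchor files, keeping at most `budget` files.

    anchors: dict mapping file path -> confidence in (0, 1]
    edges:   iterable of (src, relation, dst) tuples
    Returns a dict mapping each selected file to its propagated score.
    """
    graph = defaultdict(list)
    for src, relation, dst in edges:
        weight = RELATION_WEIGHTS.get(relation, 0.0)
        # Assumption: relations are treated as undirected for recovery.
        graph[src].append((dst, weight))
        graph[dst].append((src, weight))

    # heapq is a min-heap, so scores are stored negated.
    heap = [(-score, path) for path, score in anchors.items()]
    heapq.heapify(heap)

    selected = {}
    while heap and len(selected) < budget:
        neg_score, path = heapq.heappop(heap)
        if path in selected:
            continue  # already reached through a higher-scoring route
        selected[path] = -neg_score
        for neighbor, weight in graph[path]:
            if neighbor not in selected and weight > 0:
                # Score decays multiplicatively along each edge.
                heapq.heappush(heap, (neg_score * weight, neighbor))
    return selected


if __name__ == "__main__":
    anchors = {"a.py": 1.0}
    edges = [
        ("a.py", "imports", "b.py"),
        ("b.py", "calls", "c.py"),
        ("a.py", "same_dir", "d.py"),
    ]
    print(sorted(recover_context(anchors, edges, budget=3)))
```

Because every weight is at most 1, scores can only decrease along a path, so the first time a file is popped it carries its best achievable score. Under these assumed weights, a file two strong hops from an anchor can outrank a weakly related direct neighbor, which is one way to read recovering context "over multi-relational repository structure under budget constraints".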
