Context-as-a-Service: Surfacing Cross-File Dependency Chains for LLM-Generated Developer Documentation

2026-06-03 · Software Engineering

Software Engineering, Information Retrieval
AI summary

The authors created a tool called Context-as-a-Service (CaaS) that helps language models find and use information from different parts of a codebase when writing or checking developer documentation. CaaS indexes source code, API references, and upstream documentation, and lets the model query that index with combined keyword and semantic search. In two case studies, it helped the model find errors and missing details that a baseline with ordinary repository tools missed, mainly by connecting related information across multiple files. Using CaaS also reduced wall-clock time and input-token usage. This suggests that a dedicated retrieval layer can improve how AI assists in documenting code.

LLM agents, developer documentation, codebase indexing, API reference, semantic search, cross-file dependencies, retrieval layer, token usage, software development kit (SDK), code navigation
Authors
Ameya Gawde, Vyzantinos Repantis, Harshvardhan Singh, Lucy Moys
Abstract
LLM agents increasingly write and maintain developer documentation, but the usefulness and accuracy of that documentation often depend on dependency chains that are not obvious to follow. Even with more files in context, the agent must still decide which cross-file dependencies to trace. We present Context-as-a-Service (CaaS), a retrieval layer that LLM agents query to find evidence across the codebase as they review or generate documentation. CaaS indexes source code, API references, and upstream documentation, then lets agents query the index through tool calls that combine keyword and semantic search. We evaluate CaaS in two case studies using Claude Sonnet 4.6 on a production SDK: improving API reference comments in a core source file and validating an LLM-generated tutorial. In both studies, the baseline already had ordinary repository tools such as file reads, keyword search, and symbol navigation. CaaS adds a retrieval layer on top, so the comparison isolates added retrieval rather than basic repository access. In the API-reference review, the CaaS-augmented agent produced the same 5 missing-documentation fixes as the baseline and surfaced 4 findings the baseline missed: 2 cross-file factual errors and 2 underspecified API comments. In the tutorial validation, it surfaced 1 executable bug, 1 API-usage improvement, and 2 missing prerequisites that the baseline pipeline did not catch. These findings required tracing non-obvious dependency chains across utility files, framework internals, usage examples, tests, and component-creation logic. Over five runs per condition, adding CaaS reduced wall-clock time by 22% to 34% across the two tasks and lowered input-token usage.
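
The abstract describes tool calls that combine keyword and semantic search over an index of code, API references, and upstream documentation, but it does not give implementation details. The following is a minimal sketch of that general idea, not the authors' implementation. Every name here (`Chunk`, `hybrid_search`, the `alpha` weight, the file paths) is hypothetical, and character-trigram counts stand in for the embedding model a real system would presumably use.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    path: str  # hypothetical file path inside the indexed repository
    text: str  # a chunk of source code, API reference, or upstream docs


def tokens(text):
    # Lowercased word tokens; identifiers such as create_component stay whole.
    return re.findall(r"[a-z0-9_]+", text.lower())


def keyword_score(query, chunk):
    # Fraction of distinct query tokens that also occur in the chunk.
    q = set(tokens(query))
    if not q:
        return 0.0
    return len(q & set(tokens(chunk.text))) / len(q)


def trigram_vector(text):
    # Stand-in for an embedding (assumption): character-trigram counts.
    s = " ".join(tokens(text))
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[g] * b[g] for g in a if g in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_search(index, query, k=3, alpha=0.5):
    # Blend the two scores; alpha is the weight on the keyword part.
    qv = trigram_vector(query)
    scored = []
    for chunk in index:
        semantic = cosine(qv, trigram_vector(chunk.text))
        score = alpha * keyword_score(query, chunk) + (1 - alpha) * semantic
        scored.append((score, chunk.path))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


# Toy index; the contents are invented for illustration only.
index = [
    Chunk("src/component.py",
          "def create_component(name, retries=3): build a component with retry policy"),
    Chunk("src/utils/retry.py",
          "def with_retry(fn, retries): retry helper used during component creation"),
    Chunk("docs/changelog.md", "release notes and version history"),
]

results = hybrid_search(index, "component retry default", k=2)
for score, path in results:
    print(f"{score:.3f}  {path}")
```

In this toy index, the query ranks both the component file and the retry utility above the changelog, which is the kind of cross-file link an agent would otherwise have to discover by reading files one at a time. A linear blend of scores is only one way to combine the two signals; rank-fusion schemes are a common alternative.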