OpenSeeker-v2: Pushing the Limits of Search Agents with Informative and High-Difficulty Trajectories
2026-05-05 • Artificial Intelligence
Artificial Intelligence • Computation and Language
AI summary
The authors show that training advanced search agents for large language models doesn't always require complex, resource-heavy pipelines. With three simple changes to their training data synthesis (larger knowledge graphs, a broader tool set, and strict filtering), they rely on a simpler training method, supervised fine-tuning (SFT). Their model, OpenSeeker-v2, trained on a small dataset, achieves state-of-the-art results on several benchmarks and even beats models trained with more complicated pipelines. The work is notable for being achieved by a purely academic team, and the model is open-sourced for others to use.
Large Language Models • Supervised Fine-Tuning • Knowledge Graph • Tool Augmentation • ReAct Paradigm • BrowseComp • Continual Pre-Training • Reinforcement Learning • Search Agents • Open Source
Authors
Yuwen Du, Rui Ye, Shuo Tang, Keduan Huang, Xinyu Zhu, Yuzhu Cai, Siheng Chen
Abstract
Deep search capabilities have become an indispensable competency for frontier Large Language Model (LLM) agents, yet their development remains dominated by industrial giants. The typical industry recipe involves a highly resource-intensive pipeline spanning pre-training, continual pre-training (CPT), supervised fine-tuning (SFT), and reinforcement learning (RL). In this report, we show that, when fueled with informative and high-difficulty trajectories, a simple SFT approach can be surprisingly powerful for training frontier search agents. We introduce three simple data-synthesis modifications to establish a stronger baseline: scaling up the knowledge graph size for richer exploration, expanding the tool set for broader functionality, and strictly filtering out low-step trajectories. Trained on merely 10.6k data points, our OpenSeeker-v2 achieves state-of-the-art performance across four benchmarks among 30B-scale agents using the ReAct paradigm: 46.0% on BrowseComp, 58.1% on BrowseComp-ZH, 34.6% on Humanity's Last Exam, and 78.0% on xbench, surpassing even Tongyi DeepResearch, which is trained with a heavy CPT+SFT+RL pipeline and achieves 43.4%, 46.7%, 32.9%, and 75.0%, respectively. Notably, OpenSeeker-v2 is the first state-of-the-art search agent at this model scale and paradigm to be developed by a purely academic team using only SFT. We are excited to open-source the OpenSeeker-v2 model weights and share our simple yet effective findings to make frontier search agent research more accessible to the community.
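Of the three modifications, low-step filtering is a plain data-curation pass over the synthesized trajectories. The sketch below illustrates the general idea in Python; the trajectory format, the role names, the helper functions, and the `min_steps` threshold are illustrative assumptions, not the authors' actual pipeline.

```python
# Minimal sketch of strict low-step filtering for SFT trajectories.
# Assumptions (not from the paper): each trajectory is a list of ReAct turns,
# a step is any turn with role "tool_call", and `min_steps` is a hypothetical
# threshold chosen to keep only longer, higher-difficulty trajectories.

from typing import Dict, List

Trajectory = List[Dict[str, str]]  # e.g. [{"role": "thought" | "tool_call" | "observation", "content": "..."}]


def count_tool_steps(trajectory: Trajectory) -> int:
    """Count how many tool-call steps the agent took in one trajectory."""
    return sum(1 for turn in trajectory if turn["role"] == "tool_call")


def filter_low_step(trajectories: List[Trajectory], min_steps: int = 8) -> List[Trajectory]:
    """Discard trajectories with fewer than `min_steps` tool calls,
    so the SFT set skews toward harder, multi-step search problems."""
    return [t for t in trajectories if count_tool_steps(t) >= min_steps]


# Usage: filtered = filter_low_step(raw_trajectories, min_steps=8)
```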