## Model Introduction
We introduce O-Researcher, a new family of research agents that natively perform end-to-end, multi-turn, multi-tool deep research, without relying on external agent frameworks or manual intervention. Built on a Multi-Agent Data Synthesis paradigm, O-Researcher leverages collaborative AI agents to simulate complex tool-integrated reasoning and automatically generate diverse, research-grade instruction data. To train these models, we develop a two-stage strategy that combines supervised fine-tuning with agentic reinforcement learning on verifiable multi-domain tasks, maximizing both alignment and research capability. O-Researcher achieves state-of-the-art results on major deep research benchmarks, and we release all model weights, inference code, and datasets to accelerate future research on agentic AI. For more details, please refer to our paper and GitHub repository.
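For intuition, the snippet below sketches what a multi-turn, tool-integrated rollout of this kind looks like. It is only an illustration: the `<tool_call>` tag format, the `web_search` helper, and the stopping rule are hypothetical placeholders, not the model's actual tool interface; please see the paper for the real protocol and tool set.

```python
# Illustrative sketch of a multi-turn, multi-tool research rollout.
# The <tool_call> tag format and the web_search() helper are hypothetical
# placeholders; the actual tool interface is described in the paper.
import re


def web_search(query: str) -> str:
    """Hypothetical search tool; in practice this would call a real search API."""
    return f"[search results for: {query}]"


def run_research(model_generate, question: str, max_turns: int = 8) -> str:
    """Roll out an agentic trajectory: the model alternates between tool calls
    and reasoning until it stops requesting tools (taken here as the answer)."""
    transcript = f"Question: {question}\n"
    for _ in range(max_turns):
        step = model_generate(transcript)            # model proposes the next action
        transcript += step
        call = re.search(r"<tool_call>(.*?)</tool_call>", step, re.S)
        if call is None:                             # no tool requested -> treat as final answer
            return step
        observation = web_search(call.group(1).strip())
        transcript += f"\n<tool_result>{observation}</tool_result>\n"
    return transcript                                # fall back to the full trace if no answer emerged
```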
## Model Downloads
| Model | Download | Backbone Model | License |
|---|---|---|---|
| O-Researcher-72B-rl | 🤗 HuggingFace | Qwen-2.5-72B-Instruct | Apache License 2.0 |
| O-Researcher-72B-sft | 🤗 HuggingFace | Qwen-2.5-72B-Instruct | Apache License 2.0 |
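Since both checkpoints are fine-tunes of Qwen-2.5-72B-Instruct, they can be loaded with Hugging Face Transformers. The following is a minimal inference sketch; the repository id used below is a placeholder, so substitute the actual id linked in the table above.

```python
# Minimal inference sketch with Hugging Face Transformers.
# "O-Researcher/O-Researcher-72B-rl" is a hypothetical repo id; use the
# actual repository linked in the Model Downloads table.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "O-Researcher/O-Researcher-72B-rl"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # load in the checkpoint's native precision
    device_map="auto",    # shard the 72B model across available GPUs
)

messages = [
    {"role": "user", "content": "Survey recent approaches to agentic reinforcement learning."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```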
## Data Downloads
## Citation
If you find O-Researcher useful in your research or applications, we would appreciate it if you could cite our work:
```bibtex
@misc{yao2026oresearcheranopenended,
      title={O-Researcher: An Open Ended Deep Research Model via Multi-Agent Distillation and Agentic RL},
      author={Yi Yao and He Zhu and Piaohong Wang and Jincheng Ren and Xinlong Yang and Qianben Chen and Xiaowan Li and Dingfeng Shi and Jiaxian Li and Qiexiang Wang and Sinuo Wang and Xinpeng Liu and Jiaqi Wu and Minghao Liu and Wangchunshu Zhou},
      year={2026},
      eprint={2601.03743},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2601.03743}
}
```