A state-of-the-art 8B-parameter web agent model designed for complex information-seeking tasks and long-horizon reasoning.
## 🌟 Overview
WebExplorer-8B is an advanced web navigation agent trained on **WebExplorer-QA**. The model demonstrates exceptional performance on challenging information-seeking benchmarks while maintaining efficiency with only 8 billion parameters.
## ✨ Key Features
- 🌐 **Long-horizon Reasoning**: Supports up to 128K context length and 100 tool calling turns
- 🛠️ **Tool Utilization**: Masters search and browse functionalities
- 🏆 **State-of-the-art Performance**: Achieves best-in-class results among models under 10B parameters
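
The turn budget and the two tools listed above can be pictured as a simple loop. The following is a minimal sketch, not the project's actual scaffold: `search`, `browse`, `scripted_policy`, and `run_agent` are hypothetical stand-ins, and a real setup would replace the stubs with a search API, a page fetcher, and calls to the model.

```python
from typing import Callable, Dict, List, Optional, Tuple

MAX_TURNS = 100  # tool-calling turn budget quoted in the feature list

# Hypothetical tool stubs; real tools would hit a search engine and fetch pages.
def search(query: str) -> str:
    return f"results for: {query}"

def browse(url: str) -> str:
    return f"content of: {url}"

TOOLS: Dict[str, Callable[[str], str]] = {"search": search, "browse": browse}

History = List[Tuple[str, str]]
Policy = Callable[[History], Tuple[str, str]]

def run_agent(policy: Policy, question: str,
              max_turns: int = MAX_TURNS) -> Tuple[Optional[str], int]:
    """Alternate policy decisions and tool calls until an answer or the budget runs out.

    Returns (answer, number_of_tool_calls); answer is None if the budget is exhausted.
    """
    history: History = [("user", question)]
    for turn in range(max_turns):
        action, argument = policy(history)
        if action == "answer":
            return argument, turn
        observation = TOOLS[action](argument)  # run the requested tool
        history.append((action, observation))  # feed the observation back
    return None, max_turns

# Hypothetical scripted policy standing in for the model: search, browse, then answer.
def scripted_policy(history: History) -> Tuple[str, str]:
    step = len(history) - 1
    if step == 0:
        return "search", "example query"
    if step == 1:
        return "browse", "https://example.com"
    return "answer", "done"

answer, tool_calls = run_agent(scripted_policy, "example question")
```

In this sketch the loop stops as soon as the policy emits an answer, so the 100-turn limit acts only as an upper bound.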
## 🏗️ Model Architecture
Built on the Qwen3-8B base model and trained through a two-phase approach:
1. **Supervised Fine-tuning (SFT)**: Cold-start initialization with high-quality trajectories
2. **Reinforcement Learning (RL)**: Enhanced using the GRPO algorithm with progressive context expansion
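
For readers unfamiliar with GRPO, its central idea is to score each sampled trajectory relative to the other trajectories sampled for the same question, rather than against a learned value function. The sketch below shows only that group-relative normalization; the function name, the `eps` constant, and the use of the population standard deviation are assumptions for illustration, and it does not reflect the training code or the progressive context expansion used for this model.

```python
from statistics import mean, pstdev
from typing import List

def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one group of rollouts for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps).
    eps (an assumed constant) avoids division by zero when all rewards are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations differ on this choice
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four rollouts for one question, rewarded 1.0 if the final answer was correct.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Correct rollouts receive positive advantages and incorrect ones negative, while a group in which every rollout gets the same reward contributes zero advantage.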
Accuracy (%) of web agents on information-seeking benchmarks. BC-en and BC-zh denote BrowseComp-en and BrowseComp-zh respectively. XBench-DS refers to XBench-DeepSearch. **Bold** indicates the best performance among open-source models <100B, while <u>underlined</u> values represent the best performance among models <10B parameters. All scores of WebExplorer-8B are computed as Avg@4 using LLM-as-Judge. Entries marked with a dagger (†) were reproduced by us under our scaffold: a dagger on a model name applies to the entire row; a dagger on a number applies to that entry only.
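
As a rough illustration of the Avg@4 metric, the sketch below assumes it means the mean accuracy over four independent runs of the benchmark, with each answer marked correct or incorrect by the judge. The function name and the toy verdicts are hypothetical and are not taken from the evaluation code.

```python
from typing import List

def avg_at_k(judgements: List[List[bool]]) -> float:
    """Mean accuracy (%) across runs.

    judgements[r][i] is True if run r's answer to question i was judged correct.
    """
    per_run = [100.0 * sum(run) / len(run) for run in judgements]
    return sum(per_run) / len(per_run)

# Toy example: 4 runs over 2 questions (made-up verdicts, not real results).
runs = [
    [True, False],
    [True, True],
    [False, False],
    [True, False],
]
score = avg_at_k(runs)
```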
If you find our work useful, please consider citing:
```bibtex
@misc{liu2025webexplorer,
title={WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents},
author={Junteng Liu and Yunji Li and Chi Zhang and Jingyang Li and Aili Chen and Ke Ji and Weiyu Cheng and Zijia Wu and Chengyu Du and Qidi Xu and Jiayuan Song and Zhengmao Zhu and Wenhu Chen and Pengyu Zhao and Junxian He},