Files
CodeScout-1.7B/README.md
ModelHub XC 99ccb9849e 初始化项目,由ModelHub XC社区提供模型
Model: OpenHands/CodeScout-1.7B
Source: Original Platform
2026-06-05 00:06:25 +08:00

4.8 KiB
Raw Blame History

library_name, license, language, base_model, pipeline_tag, tags, datasets
library_name license language base_model pipeline_tag tags datasets
transformers apache-2.0
en
Qwen/Qwen3-1.7B text-generation
code-search
code-localization
reinforcement-learning
agent
software-engineering
GSPO
OpenHands
SWE-Bench
OpenHands/SWE-smith-py-code-search
OpenHands/SWE-Gym-code-search
OpenHands/CodeScout_Training_Rollouts

CodeScout-1.7B

📄 Paper💻 Code🤗 Collection

Compact yet powerful — outperforms 8× larger Qwen3-14B using only a Unix terminal.

CodeScout Overview

CodeScout-1.7B is part of the CodeScout family of open-source RL-trained code search agents. CodeScout models achieve state-of-the-art repository-level code localization using nothing more than a standard Unix terminal — no static analysis, no repository graphs, no language-specific tooling.

Key Highlights

  • Outperforms 8× larger Qwen3-14B with absolute F1 gains of 1118% for files and 1015% for functions
  • Competitive with 18× larger Qwen3-32B (Thinking), surpassing it by 36% in function F1
  • Matches RepoNavigator-7B performance while being 4× smaller
  • Demonstrates that RL + distillation can compress strong code search into a 1.7B model

Results

Performance on SWE-Bench code localization (instance-averaged F1 scores):

Benchmark CodeScout-1.7B CodeScout-4B CodeScout-14B
SWE-Bench Verified — File F1 55.46 68.52 68.57
SWE-Bench Verified — Func F1 28.22 36.78 40.32
SWE-Bench Pro — File F1 40.96 51.77 53.63
SWE-Bench Pro — Func F1 18.24 29.03 28.74
SWE-Bench Lite — File F1 56.57 67.03 71.84
SWE-Bench Lite — Func F1 27.07 39.87 44.43

File-level F1 vs Model Size Function-level F1 vs Model Size

Code localization performance on SWE-Bench Verified. CodeScout () achieves superior or competitive results over larger open-source LLMs and narrows the gap with closed-source frontier models.

Training

CodeScout-1.7B is trained in two stages:

Stage 1 — Rejection Fine-Tuning (RFT): Qwen3-1.7B is warm-started via supervised fine-tuning on 4K perfect-score trajectories (F1 = 1.0 at all granularities) sampled from CodeScout-14B, yielding the CodeScout-1.7B-RFT checkpoint.

Stage 2 — RL Training: CodeScout-1.7B-RFT is further trained with GSPO reinforcement learning.

  • Training data (RL): 800 instances (disjoint from RFT data)
  • RL steps: 100
  • Batch size: 8, with 8 rollouts per instance
  • Max context length: 32K tokens
  • Max turns per episode: 4
  • Reward: Multi-level F1 (file + module + function)
  • Hardware: 8×H100 GPUs
  • Learning rate: 1e-6 (constant)

How It Works

CodeScout uses the OpenHands-Bash scaffold — an agent equipped with only a Terminal tool (supporting standard Unix commands like rg, find, grep, ls) and a LocalizationFinish tool for structured output submission. The agent iteratively navigates the repository to identify relevant files, classes, and functions related to a given issue.

The model is trained with GSPO (Group Sequence Policy Optimization) using multi-level F1 rewards at the file, module, and function level.

Intended Use

CodeScout-1.7B is designed for repository-level code localization: given a GitHub issue description and a code repository, it identifies the relevant files, classes, and functions that need to be modified. It is intended to be used as a localization subagent within larger coding agent pipelines.

Limitations

  • Trained and evaluated exclusively on Python repositories
  • Designed for code localization, not code editing or issue resolution
  • Performance may vary on repositories significantly different from the training distribution
  • Requires the OpenHands-Bash scaffold for optimal performance

Citation

@misc{sutawika2026codescouteffectiverecipereinforcement,
      title={CodeScout: An Effective Recipe for Reinforcement Learning of Code Search Agents}, 
      author={Lintang Sutawika and Aditya Bharat Soni and Bharath Sriraam R R and Apurva Gandhi and Taha Yassine and Sanidhya Vijayvargiya and Yuchen Li and Xuhui Zhou and Yilin Zhang and Leander Melroy Maben and Graham Neubig},
      year={2026},
      eprint={2603.17829},
      archivePrefix={arXiv},
      primaryClass={cs.SE},
      url={https://arxiv.org/abs/2603.17829}, 
}