---
license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-7B
tags:
- code
- autocomplete
- next-edit-prediction
- dpo
language:
- en
pipeline_tag: text-generation
---
# sweep-next-edit-v2-7B
A 7B parameter model that predicts the next edit a developer will make. Given the current file, recent diffs, and cursor position, the model predicts what code block the developer will change next and how.
## Usage
```bash
pip install transformers torch accelerate
python inference.py
```
See [`inference.py`](inference.py) for a complete working example.
```python
from inference import build_prompt, generate, FileChunk

# model, tokenizer, edited_contents, cursor_position, and recent_changes
# are set up as shown in inference.py.
prompt, code_block, block_start, relative_cursor = build_prompt(
    file_path="example.py",
    file_contents=edited_contents,
    cursor_position=cursor_position,
    recent_changes=recent_changes,
    retrieval_chunks=[FileChunk("utils.py", "def helper(): ...")],
    changes_above_cursor=False,
)
completion = generate(model, tokenizer, prompt, device="cuda")
```
### Prompt format
The model uses `<|file_sep|>` delimiters and a `<|cursor|>` marker:
```
<|file_sep|>{file_path}
{file_contents}
{retrieval_chunks}
{recent_changes_as_diffs}
<|file_sep|>original/{file_path}:{start}:{end}
{code_block_before_last_edit}
<|file_sep|>current/{file_path}:{start}:{end}
{code_block_with_cursor_marker}
<|file_sep|>updated/{file_path}:{start}:{end}
{prefill}
```
The model completes the `updated/` section with the predicted new code block.
- **file_path section**: ~300 lines of file context around the cursor
- **retrieval chunks**: Cross-file context from related files
- **recent changes**: Diffs of recent edits in `original:/updated:` format
- **original/**: Code block around cursor before the last edit
- **current/**: Same block with `<|cursor|>` inserted at cursor position
- **updated/**: Model output — the predicted edited code block
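The sections above are plain text joined by `<|file_sep|>` delimiters. A minimal sketch of the assembly (the `make_prompt` helper is illustrative and omits retrieval chunks and recent-change diffs; `build_prompt` in [`inference.py`](inference.py) is the real builder):

```python
# Illustrative sketch of the <|file_sep|>-delimited prompt layout.
# Not the code from inference.py; retrieval chunks and recent-change
# diffs would be appended to the first section.

def make_prompt(file_path, file_contents, original_block, current_block,
                start, end, prefill=""):
    """Assemble the file, original/, current/, and updated/ sections."""
    return (
        f"<|file_sep|>{file_path}\n{file_contents}\n"
        f"<|file_sep|>original/{file_path}:{start}:{end}\n{original_block}\n"
        f"<|file_sep|>current/{file_path}:{start}:{end}\n{current_block}\n"
        f"<|file_sep|>updated/{file_path}:{start}:{end}\n{prefill}"
    )

prompt = make_prompt(
    "example.py",
    "def add(a, b):\n    return a + b\n",
    "def add(a, b):\n    return a + b",
    "def add(a, b):\n    return a +<|cursor|> b",
    1, 2,
)
```

The model then continues generating after the `updated/` header until it emits a stop token.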
### Prefill strategy
The `updated/` section is seeded with a prefill to constrain generation:
- **Default** (`changes_above_cursor=False`): Prefill everything up to the cursor line. The model only generates from the cursor line onward.
- **After insertion** (`changes_above_cursor=True`): Prefill only the first line + trailing blank lines. Gives the model freedom to rewrite lines between the insertion point and cursor.
### Recent changes format
```
<|file_sep|>{file_path}:{start_line}:{end_line}
original:
{old_code}
updated:
{new_code}
```
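Each recent edit is rendered as one such block; a minimal sketch (the `format_recent_change` helper is illustrative):

```python
# Illustrative formatter for one recent edit in the
# original:/updated: diff format shown above.

def format_recent_change(file_path, start_line, end_line, old_code, new_code):
    """Render a single edit as a <|file_sep|>-delimited diff block."""
    return (
        f"<|file_sep|>{file_path}:{start_line}:{end_line}\n"
        f"original:\n{old_code}\n"
        f"updated:\n{new_code}"
    )

change = format_recent_change(
    "utils.py", 10, 12,
    "def helper():\n    pass",
    "def helper():\n    return 42",
)
```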
## Details
Fine-tuned from [Qwen2.5-Coder-7B](https://huggingface.co/Qwen/Qwen2.5-Coder-7B) on developer editing traces using SFT, then GRPO, then DPO.
<table>
<tr><td>Base model</td><td>Qwen2.5-Coder-7B</td></tr>
<tr><td>Fine-tuning</td><td>SFT → GRPO → DPO</td></tr>
<tr><td>Parameters</td><td>7B</td></tr>
<tr><td>Precision</td><td>bfloat16</td></tr>
<tr><td>Context length</td><td>32,768 tokens</td></tr>
<tr><td>Architecture</td><td>Qwen2 (28 layers, hidden dim 3584)</td></tr>
<tr><td>Stop tokens</td><td><code>&lt;|endoftext|&gt;</code>, <code>&lt;|file_sep|&gt;</code></td></tr>
<tr><td>Max output tokens</td><td>1024</td></tr>
<tr><td>Decoding</td><td>Greedy (temperature=0)</td></tr>
</table>
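
When decoding outside of `inference.py`, output should be cut at the first stop token from the table above. A minimal post-processing sketch (the helper name is illustrative):

```python
# Illustrative post-processing: truncate raw decoded text at the first
# stop token, per the table above.

def truncate_at_stop(text, stop_tokens=("<|endoftext|>", "<|file_sep|>")):
    """Return text up to (not including) the earliest stop token."""
    cut = len(text)
    for tok in stop_tokens:
        idx = text.find(tok)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]
```

With greedy decoding (temperature 0), the same prompt always yields the same completion, which keeps edit predictions deterministic across runs.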