ModelHub XC ac2d159a9d Initialize project; model provided by the ModelHub XC community
Model: sunkencity/qwen25-3b-openclaw
Source: Original Platform
2026-05-27 02:57:15 +08:00

---
license: apache-2.0
base_model: Qwen/Qwen2.5-3B-Instruct
tags:
- tool-use
- function-calling
- qwen2.5
- mlx
- lora
- openclaw
- localclaw
- vllm
- agent
language:
- en
library_name: transformers
pipeline_tag: text-generation
---
# qwen25-3b-openclaw
A **Qwen2.5-3B-Instruct** model fine-tuned for tool/function calling, purpose-built as the local agent model for [OpenClaw](https://github.com/openclaw/openclaw) / LocalClaw and served via vLLM.
## Model Summary
| Property | Value |
|----------|-------|
| **Base model** | [Qwen/Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) |
| **Fine-tuning method** | LoRA (rank=16, alpha=32, applied to 32 transformer layers) |
| **Training framework** | [mlx-lm](https://github.com/ml-explore/mlx-lm) on Apple M4 Max |
| **Training data** | ~57k tool-call examples (hermes-function-calling-v1 + glaive-function-calling-v2) |
| **Training steps** | 600 |
| **Tool format** | `<tool_call>` / `</tool_call>` (Qwen/Hermes convention) |
| **Serving target** | vLLM with `--enable-auto-tool-choice --tool-call-parser hermes` |
## Evaluation
Evaluated on a held-out set of 50 tool-calling examples drawn from the same distribution as the training data, so the scores below measure in-distribution performance.
| Metric | Score |
|--------|-------|
| **tool_score** (composite) | **0.989** |
| name_accuracy | 1.000 |
| arg_f1 | 0.983 |
| parse_rate | 0.980 |
| val_loss | 0.010 |
- **name_accuracy**: fraction of examples where the correct function name was called
- **arg_f1**: token-level F1 between predicted and ground-truth arguments
- **parse_rate**: fraction of outputs that contained a valid, parseable `<tool_call>` block
- **tool_score**: `name_accuracy × 0.4 + arg_f1 × 0.4 + parse_rate × 0.2`
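
As a sanity check, the composite can be recomputed from the three component metrics in the table. The sketch below also shows one plausible reading of token-level `arg_f1`. Whitespace tokenisation of the JSON-serialised arguments is an assumption, since this card does not say how arguments are tokenised, and the function names are illustrative rather than taken from the evaluation script.

```python
import json
from collections import Counter


def tool_score(name_accuracy: float, arg_f1: float, parse_rate: float) -> float:
    """Weighted composite, using the weights stated in this card."""
    return name_accuracy * 0.4 + arg_f1 * 0.4 + parse_rate * 0.2


def token_f1(predicted: dict, reference: dict) -> float:
    """Token-level F1 over JSON-serialised arguments.

    Assumption: tokens are the whitespace-separated pieces of a sorted-key
    JSON dump; the real evaluation may tokenise differently.
    """
    pred_tokens = json.dumps(predicted, sort_keys=True).split()
    ref_tokens = json.dumps(reference, sort_keys=True).split()
    # Multiset intersection counts each shared token at most as often as it
    # appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Recompute the composite from the reported component scores.
print(round(tool_score(1.000, 0.983, 0.980), 3))  # 0.989
```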
## What It's Good For
- **OpenClaw / LocalClaw agent**: drop-in local model for the tool-calling tier; handles calendars, email, web search, browser control, and custom skill tools
- **Any OpenAI-compatible tool-use pipeline**: responds to the standard `tools` parameter and produces structured function calls
- **Offline / privacy-first deployments**: at 3B parameters, the model runs fast on Apple Silicon or a modest GPU, with no cloud dependency
- **Multi-tool selection**: trained on examples with multiple available tools; name_accuracy was 1.000 on the held-out set
- **Argument extraction**: arg_f1 was 0.983 on the held-out set when extracting typed arguments from natural-language queries
## What It's Not Good For
- Long multi-turn reasoning chains (consider a larger model for orchestration)
- Tasks requiring no tools — the model is biased toward calling tools when they're available
- Languages other than English (training data is English-only)
## Tool Call Format
The model outputs tool calls using the Hermes `<tool_call>` convention:
```
<tool_call>
{"name": "function_name", "arguments": {"arg1": "value1", "arg2": 42}}
</tool_call>
```
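When working with raw generations (for example from mlx-lm) rather than a server that already parses tool calls, the blocks can be recovered with a regular expression and `json.loads`. This is a minimal sketch, not the parser vLLM uses; `parse_tool_calls` is an illustrative name.

```python
import json
import re

# Non-greedy match of everything between one opening and one closing tag.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def parse_tool_calls(text: str) -> list:
    """Return every well-formed tool call found in `text`, in order."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip blocks whose body is not valid JSON
        if isinstance(payload, dict) and "name" in payload:
            calls.append(
                {"name": payload["name"], "arguments": payload.get("arguments", {})}
            )
    return calls


sample_output = (
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"location": "San Francisco"}}\n'
    "</tool_call>\n"
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"location": "Oakland"}}\n'
    "</tool_call>"
)
for call in parse_tool_calls(sample_output):
    print(call["name"], call["arguments"])
```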
Multiple parallel calls are output sequentially, each in its own block.
The system prompt should include available tools in `<tools></tools>` XML tags:
```
You are a function calling AI model. You are provided with function signatures
within <tools> </tools> XML tags. You may call one or more functions to assist
with the user query.
<tools>
{"type": "function", "function": {"name": "get_weather", "description": "...", "parameters": {...}}}
</tools>
For each function call return a json object within <tool_call> </tool_call> tags.
```
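The prompt above can be assembled programmatically from a list of tool schemas. The following is a sketch that assumes one JSON object per line inside `<tools>`; `build_system_prompt` and the `WEATHER_TOOL` schema are illustrative, not part of any library.

```python
import json


def build_system_prompt(tools: list) -> str:
    """Render tool schemas into the <tools> system prompt shown above."""
    tool_lines = "\n".join(json.dumps(tool) for tool in tools)
    return (
        "You are a function calling AI model. You are provided with function "
        "signatures within <tools> </tools> XML tags. You may call one or more "
        "functions to assist with the user query.\n"
        f"<tools>\n{tool_lines}\n</tools>\n"
        "For each function call return a json object within "
        "<tool_call> </tool_call> tags."
    )


# Hypothetical tool schema used only for illustration.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}

print(build_system_prompt([WEATHER_TOOL]))
```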
## Quick Start with vLLM
```bash
pip install vllm
vllm serve sunkencity/qwen25-3b-openclaw \
  --enable-auto-tool-choice \
  --tool-call-parser hermes \
  --port 8000
```
This exposes an OpenAI-compatible API at `http://localhost:8000/v1` with structured `tool_calls` in responses.
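
A request against that endpoint might look like the sketch below, which uses only the standard library and the OpenAI chat-completions request shape (`model`, `messages`, `tools`, `tool_choice`). The `get_weather` schema and the helper names are illustrative assumptions, and the network call is left commented out because it needs the server to be running.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # matches --port 8000 above
MODEL = "sunkencity/qwen25-3b-openclaw"

# Hypothetical tool schema used only for illustration.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}


def build_chat_request(user_message: str, tools: list) -> dict:
    """Build an OpenAI-style chat completion payload with tools attached."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": user_message}],
        "tools": tools,
        "tool_choice": "auto",
    }


def send_chat_request(payload: dict, base_url: str = BASE_URL) -> dict:
    """POST the payload to /chat/completions and decode the JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


payload = build_chat_request("What is the weather in San Francisco?", [WEATHER_TOOL])

# With the server running:
# reply = send_chat_request(payload)
# print(reply["choices"][0]["message"].get("tool_calls"))
```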
## Integration with OpenClaw / LocalClaw
LocalClaw auto-discovers models from a running vLLM server. After starting the server above, add the following to `~/.localclaw/openclaw.local.json`:
```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "vllm/sunkencity/qwen25-3b-openclaw"
      }
    }
  }
}
```
LocalClaw will route tool-calling tasks to this model automatically via the three-tier routing system.
## Python Usage (mlx-lm)
```python
from mlx_lm import load, generate

model, tokenizer = load("sunkencity/qwen25-3b-openclaw")

messages = [
    {
        "role": "system",
        "content": (
            "You are a function calling AI model.\n\n"
            "<tools>\n"
            '{"type": "function", "function": {"name": "get_weather", '
            '"description": "Get current weather", "parameters": {"type": "object", '
            '"properties": {"location": {"type": "string"}}, "required": ["location"]}}}\n'
            "</tools>\n\n"
            "For each function call return a json object within <tool_call> </tool_call> tags."
        ),
    },
    {"role": "user", "content": "What is the weather in San Francisco?"},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
response = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=False)
print(response)
# <tool_call>
# {"name": "get_weather", "arguments": {"location": "San Francisco"}}
# </tool_call>
```
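Once a call has been parsed into a name and arguments, the host application still has to execute it. The sketch below dispatches through a plain dictionary registry and wraps the result as a `tool` role message. The `get_weather` implementation, the registry, and the exact message shape are assumptions for illustration; check the tokenizer's chat template before feeding such messages back to the model.

```python
import json


def get_weather(location: str) -> dict:
    """Hypothetical stand-in; a real tool would query a weather service."""
    return {"location": location, "forecast": "unknown"}


# Maps tool names (as the model emits them) to Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_call(call: dict) -> dict:
    """Execute one parsed call and wrap the result as a chat message."""
    func = TOOL_REGISTRY.get(call["name"])
    if func is None:
        result = {"error": f"unknown tool: {call['name']}"}
    else:
        try:
            result = func(**call.get("arguments", {}))
        except TypeError as exc:  # missing or unexpected arguments
            result = {"error": str(exc)}
    return {"role": "tool", "name": call["name"], "content": json.dumps(result)}


message = run_tool_call(
    {"name": "get_weather", "arguments": {"location": "San Francisco"}}
)
print(message["content"])
```

Returning errors as tool content, rather than raising, lets the model see the failure and retry with corrected arguments.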
## Training Details
**Datasets:**
- [NousResearch/hermes-function-calling-v1](https://huggingface.co/datasets/NousResearch/hermes-function-calling-v1) — 1,883 examples, already in `<tool_call>` format
- [glaiveai/glaive-function-calling-v2](https://huggingface.co/datasets/glaiveai/glaive-function-calling-v2) — 55,000 examples converted from `<functioncall>` format
**Preprocessing:**
- Conversations truncated to the first tool-call turn (system + user + assistant) to ensure tool call output is always within the sequence budget
- `<functioncall>` to `<tool_call>` normalization, including balanced-brace extraction and single-quoted argument repair
- 90/5/5 train/valid/eval split
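
The normalization step can be sketched as follows. This is not the tool-tuner implementation: it assumes the glaive-style turn looks like `<functioncall> {"name": ..., "arguments": '{...}'}`, with the arguments wrapped in single quotes, and the brace counter ignores braces that appear inside string values. All function names are illustrative.

```python
import json
import re
from typing import Optional

# Matches "arguments": '{...}' so the surrounding single quotes can be dropped.
SINGLE_QUOTED_ARGS_RE = re.compile(r"""("arguments"\s*:\s*)'(\{.*\})'""", re.DOTALL)


def extract_balanced_json(text: str, start: int) -> Optional[str]:
    """Return the span from the first '{' at or after `start` to its matching '}'."""
    begin = text.find("{", start)
    if begin == -1:
        return None
    depth = 0
    for index in range(begin, len(text)):
        if text[index] == "{":
            depth += 1
        elif text[index] == "}":
            depth -= 1
            if depth == 0:
                return text[begin:index + 1]
    return None  # braces never balanced


def functioncall_to_tool_call(assistant_text: str) -> Optional[str]:
    """Convert one <functioncall> turn into a <tool_call> block, or None on failure."""
    marker = "<functioncall>"
    position = assistant_text.find(marker)
    if position == -1:
        return None
    raw = extract_balanced_json(assistant_text, position + len(marker))
    if raw is None:
        return None
    repaired = SINGLE_QUOTED_ARGS_RE.sub(r"\1\2", raw)
    try:
        call = json.loads(repaired)
    except json.JSONDecodeError:
        return None
    return "<tool_call>\n" + json.dumps(call) + "\n</tool_call>"


example = (
    '<functioncall> {"name": "calculate_bmi", '
    '"arguments": \'{"weight": 70, "height": 1.75}\'} <|endoftext|>'
)
print(functioncall_to_tool_call(example))
```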
**LoRA config:**
```yaml
fine_tune_type: lora
num_layers: 32 # number of transformer layers LoRA is applied to
lora_parameters:
  rank: 16
  alpha: 32
  dropout: 0.0
  scale: 10.0
optimizer: adamw
learning_rate: 2e-4
batch_size: 4
iters: 600
max_seq_length: 2048
mask_prompt: true # loss only on assistant turns
grad_checkpoint: true
```
**Hardware:** Apple M4 Max (128 GB unified memory), ~34 minutes wall-clock
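
A back-of-the-envelope reading of these numbers follows. It assumes one sequence per batch element and no gradient accumulation, neither of which is stated in this card, and it treats the ~34 minutes as pure training time, so the results are rough estimates only.

```python
ITERS = 600
BATCH_SIZE = 4
TOTAL_EXAMPLES = 1_883 + 55_000  # hermes + glaive, as listed under Datasets
TRAIN_FRACTION = 0.90            # from the 90/5/5 split
WALL_CLOCK_MINUTES = 34          # approximate, as reported

sequences_seen = ITERS * BATCH_SIZE
train_examples = round(TOTAL_EXAMPLES * TRAIN_FRACTION)
fraction_of_train_set = sequences_seen / train_examples
seconds_per_step = WALL_CLOCK_MINUTES * 60 / ITERS

print(f"sequences seen:        {sequences_seen}")
print(f"train examples:        {train_examples}")
print(f"share of train split:  {fraction_of_train_set:.1%}")
print(f"seconds per step:      {seconds_per_step:.1f}")
```

Under those assumptions the run covers roughly 5% of the training split, well short of a full epoch.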
## Training Code
Training was performed using the [tool-tuner](https://github.com/sunkencity/tool-tuner) framework — an autoresearch-inspired autonomous LoRA experiment loop built with mlx-lm.
## License
Apache 2.0 — same as base model.