ModelHub XC 7de5e990b5 Initialize project; model provided by the ModelHub XC community
Model: clawdiaonduty/clawdia-qwen3-4b
Source: Original Platform
2026-06-19 15:16:20 +08:00

---
license: apache-2.0
base_model: Qwen/Qwen3-4B
tags:
- clawdia
- qwen3
- lora
- gguf
- on-device
- tool-use
- function-calling
- macos
language:
- en
library_name: gguf
pipeline_tag: text-generation
---
# Clawdia-Qwen3-4B
LoRA fine-tune of **Qwen/Qwen3-4B** for on-device use inside [Clawdia](https://clawdia.app). Bigger sibling of the 1.7B build — same training data, better instruction-following, fewer hallucinations on Clawdia-specific UI questions.
This is the **recommended local model for systems with 8+ GB RAM**. The Q5_K_M GGUF is ~2.7 GB; pair it with Clawdia's bundled llama.cpp runtime.
For a smaller (1.2 GB) variant, see **[clawdiaonduty/clawdia-qwen3-1.7b](https://huggingface.co/clawdiaonduty/clawdia-qwen3-1.7b)**.
---
## Files
| File | Format | Size | Use |
|---|---|---|---|
| `qwen3-4b-clawdia.Q5_K_M.gguf` | GGUF, Q5_K_M | **2.7 GB** | **Recommended** — best quality / speed trade-off |
| `qwen3-4b-clawdia.Q4_K_M.gguf` | GGUF, Q4_K_M | 2.3 GB | Smaller, slightly worse |
| `qwen3-4b-clawdia.f16.gguf` | GGUF, f16 | 7.5 GB | Unquantized half precision (for further fine-tuning or reference) |
---
## How to use
### Inside Clawdia (recommended)
Settings → Local Inference → pick `Clawdia-Qwen3 4B Q5_K_M`. Clawdia downloads the model to `~/.clawdia/local-inference/models/` and runs it via the bundled llama.cpp runtime.
### llama.cpp directly
```bash
# Single quotes stop the shell from expanding "$1" inside "$14.50".
llama-completion \
  --model qwen3-4b-clawdia.Q5_K_M.gguf \
  --jinja \
  -sysf system_prompt.txt \
  -p 'log $14.50 for lunch /no_think' \
  --temp 0.0 -n 280
```
Two critical details:
- `--jinja`: enables the embedded Qwen3 chat template (tool calls depend on this).
- `/no_think`: append it to every user message. Qwen3's default thinking mode otherwise spends tokens on reasoning before it reaches the tool call.
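
The same invocation can be driven from a script. The sketch below is a minimal, hedged example: `build_command` is a hypothetical helper (not part of Clawdia or llama.cpp) that reuses only the flags from the command above, appends `/no_think` when it is missing, and returns an argument list. Passing a list to `subprocess.run` bypasses the shell, so a literal `$14.50` in the prompt reaches the model unchanged. Actually running it assumes `llama-completion`, the model file, and `system_prompt.txt` are available locally.

```python
import shlex

NO_THINK = "/no_think"


def build_command(user_message,
                  model="qwen3-4b-clawdia.Q5_K_M.gguf",
                  system_prompt_file="system_prompt.txt"):
    """Return the argv list for the llama-completion call shown above."""
    prompt = user_message.strip()
    # Add the soft switch only if the caller has not already done so.
    if not prompt.endswith(NO_THINK):
        prompt = f"{prompt} {NO_THINK}"
    return [
        "llama-completion",
        "--model", model,
        "--jinja",
        "-sysf", system_prompt_file,
        "-p", prompt,
        "--temp", "0.0",
        "-n", "280",
    ]


argv = build_command("log $14.50 for lunch")
# Show the shell-safe form of the command without executing anything.
print(shlex.join(argv))

# To run it for real (needs llama.cpp on PATH and the files in place):
#   import subprocess
#   subprocess.run(argv, check=True)
```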
---
## What it does well
| Probe | Output |
|---|---|
| `"who are you?"` | "Clawdia. macOS assistant. Runs on-device by default. I can send messages, schedule reminders, log expenses, take screenshots, run CLIs, and keep a memory wiki." |
| `"how do I switch to claude opus?"` | "Settings → Providers → Anthropic → paste API key → save. Then pick a Claude Opus model in the chat-header dropdown." |
| `"how do I enable iMessage?"` | Numbered steps for Full Disk Access + Automation + watch handle + restart. |
| `"where is my data stored?"` | "Everything lives at `~/.clawdia/`. The chat history, memory wiki, scheduled tasks, settings, and model downloads are there." |
| `"lunch was 30 for 2 of us and we split"` | `math(expression="30 / 2")` first, then logs the $15 share. |
| `"remind me on the 28th of every month to pay rent"` | `scheduled_task_create({"name":"Pay rent","schedule":{"every":1,"unit":"month","dayOfMonth":28,"atTime":"09:00"},"audience":"user","category":"Finance"})` |
| `"rm -rf my home directory"` | "Not running that — it would wipe your home folder. If you meant deleting a specific subfolder, point it out and I'll do that." |
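
For readers who want to inspect such outputs programmatically, here is a minimal sketch that splits a call written as `name({...json...})`, the textual form of the `scheduled_task_create` row above, into a tool name and an argument dict. It is an assumption that raw model output looks exactly like the table cell; the format Clawdia actually consumes is not documented here, and `parse_json_tool_call` is a hypothetical helper. Keyword-style calls such as `math(expression="30 / 2")` are not JSON and are rejected by this parser.

```python
import json
import re

# Matches: identifier, "(", anything, ")" with optional surrounding whitespace.
CALL_RE = re.compile(r"^\s*(?P<name>[A-Za-z_]\w*)\((?P<args>.*)\)\s*$", re.DOTALL)


def parse_json_tool_call(text):
    """Split 'name({...})' into (name, dict); raise ValueError otherwise."""
    match = CALL_RE.match(text)
    if match is None:
        raise ValueError(f"not a tool call: {text!r}")
    try:
        args = json.loads(match.group("args"))
    except json.JSONDecodeError as exc:
        raise ValueError(f"arguments are not valid JSON: {exc}") from exc
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    return match.group("name"), args


raw = (
    'scheduled_task_create({"name":"Pay rent","schedule":{"every":1,'
    '"unit":"month","dayOfMonth":28,"atTime":"09:00"},'
    '"audience":"user","category":"Finance"})'
)
name, args = parse_json_tool_call(raw)
print(name, args["schedule"]["dayOfMonth"], args["schedule"]["atTime"])
# prints: scheduled_task_create 28 09:00
```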
---
## Training
- **Base:** Qwen/Qwen3-4B
- **Adapter:** LoRA rank 32, alpha 32, dropout 0.05 — applied to `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` on the top 16 transformer layers
- **Data:** 1,662 hand-authored multi-turn dialogs across 25 categories (finance, memory, iMessage, Telegram/WhatsApp, scheduled tasks, pantry, proactive, todos/habits/journal, setup/safety, edge cases, indirect/proactive offers, goal-aware reasoning, math splits, packages/orders, web/news, MCP tools, memory CLI, Clawdia self-knowledge, Clawdia UI / don't-lie discipline)
- **Mask:** `train_on_responses_only` — loss only on assistant tokens
- **Schedule:** AdamW, lr 2e-4, cosine decay, 5% warmup, 4 epochs (~430 steps), effective batch 16, `max_seq_length=6144`
- **Hardware:** 1× Modal H100, ~29 min wall-clock
- **Loss:** train loss averaged 0.40; best eval loss was 0.565 at epoch 1.92. The final eval loss was higher, which indicates slight overfitting; use an earlier checkpoint if needed.
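
Two items in the recipe above can be illustrated in plain Python. This is a sketch under stated assumptions, not the author's training code: `mask_non_assistant` shows the idea behind response-only loss masking using toy token ids and the conventional ignore label `-100`, and `lr_at` shows a linear warmup over the first 5% of steps followed by cosine decay. The linear shape of the warmup and the decay to exactly zero are assumptions; the source states only "cosine decay, 5% warmup".

```python
import math

# Label value that PyTorch's CrossEntropyLoss skips by default (ignore_index).
IGNORE_INDEX = -100


def mask_non_assistant(segments):
    """segments: list of (role, token_ids) pairs. Returns (input_ids, labels).

    Only assistant tokens keep their ids as labels, so only they add to the loss.
    """
    input_ids, labels = [], []
    for role, token_ids in segments:
        input_ids.extend(token_ids)
        if role == "assistant":
            labels.extend(token_ids)
        else:
            labels.extend([IGNORE_INDEX] * len(token_ids))
    return input_ids, labels


def lr_at(step, total_steps=430, peak_lr=2e-4, warmup_frac=0.05):
    """Learning rate at a given step: linear warmup, then cosine decay to zero."""
    warmup_steps = max(1, int(total_steps * warmup_frac))
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


# Toy dialog: the token ids are made up for illustration.
ids, labels = mask_non_assistant([
    ("system", [1, 2]),
    ("user", [3, 4, 5]),
    ("assistant", [6, 7]),
])
print(labels)
print([round(lr_at(s), 6) for s in (0, 21, 215, 430)])
```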
---
## Known rough edges
- **Tool-name drift** in some finance/memory calls: occasionally emits `finance_add_expense` instead of canonical `finance(action="add_expense")`. Less frequent than the 1.7B variant but still happens. Targeted fix in next iteration.
- **Identity string drift**: When asked "what model are you?" the 4B variant still answers "Clawdia-Qwen3-1.7B" — the training data was authored for the 1.7B build. Cosmetic.
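
Until the tool-name drift is fixed in training, a caller could repair it after generation. The sketch below is one hedged way to do that: `normalize_tool_call` is a hypothetical helper, the set of canonical tool names is assumed from the note above (only `finance` and `memory` are mentioned), and the `amount` argument is made up for the example. A flattened name such as `finance_add_expense` is rewritten to the tool `finance` with `action="add_expense"`; anything else passes through unchanged.

```python
# Assumed canonical tools; only these two are named in the note above.
CANONICAL_TOOLS = ("finance", "memory")


def normalize_tool_call(name, args):
    """Rewrite 'finance_add_expense' style names to (tool, {'action': ...})."""
    if name in CANONICAL_TOOLS:
        return name, dict(args)
    for tool in CANONICAL_TOOLS:
        prefix = tool + "_"
        if name.startswith(prefix) and len(name) > len(prefix):
            fixed = dict(args)
            # Keep an explicit action if the model already supplied one.
            fixed.setdefault("action", name[len(prefix):])
            return tool, fixed
    # Unknown tool name: leave it for the caller to handle.
    return name, dict(args)


print(normalize_tool_call("finance_add_expense", {"amount": 14.5}))
# prints: ('finance', {'amount': 14.5, 'action': 'add_expense'})
```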
---
## License
Apache 2.0 — inherited from Qwen/Qwen3-4B.