---
license: apache-2.0
base_model: Qwen/Qwen2.5-0.5B-Instruct
tags:
- qwen
- qwen2.5
- sft
- lora
- unsloth
- indonesian
- tool-calling
- assistant
language:
- id
- en
pipeline_tag: text-generation
datasets:
- BoyBarley/sparky-dataset-v3
model-index:
- name: BoyBarley-Sparky-v3
  results:
  - task:
      type: text-generation
      name: Autonomous Assistant Benchmark
    metrics:
    - type: overall
      value: 89.92
      name: Overall Score
    - type: identity
      value: 85.93
      name: Identity
    - type: tool-calling
      value: 85.00
      name: Tool Calling
    - type: refusal
      value: 95.58
      name: Safety Refusal
    - type: coding
      value: 88.88
      name: Coding
    - type: general
      value: 100.0
      name: General QA
---

<div align="center">

# ⚡ BoyBarley Sparky v3

### *The Fast, Professional, Energetic AI Assistant*

[Model](https://huggingface.co/BoyBarley/BoyBarley-Sparky-v3) • [GGUF](https://huggingface.co/BoyBarley/BoyBarley-Sparky-v3-GGUF) • [LoRA](https://huggingface.co/BoyBarley/BoyBarley-Sparky-v3-lora) • [License](LICENSE) • [Base Model](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct)

**Meet Barley**: a 500-million-parameter autonomous AI assistant that is *agile*, *professional*, and *ready to work*.
Designed for **coding**, **server management**, and **task automation** with a safety-first mindset.

[🚀 Quick Start](#-quick-start) • [📊 Benchmark](#-benchmark) • [🛠️ Tools](#%EF%B8%8F-tools--capabilities) • [💬 Examples](#-examples) • [⚖️ Safety](#%EF%B8%8F-safety--alignment)

</div>

---

## ✨ Why Barley?

> *"Small model, big personality. Built to work, not just chat."*

- 🏃 **Lightweight**: only **0.5B parameters**; runs on a **CPU/VM with 1 GB RAM** (Q4 version)
- 🎯 **Tool-native**: emits valid JSON tool calls that are ready to execute
- 🛡️ **Safe by design**: consistently refuses destructive commands (`sudo`, `rm -rf`, etc.)
- 🇮🇩 **Indonesian-first**: fine-tuned on a bilingual Indonesian + English dataset
- 🧠 **Grounded identity**: identifies consistently as Barley rather than as Qwen
- ⚡ **Fast inference**: 50+ tok/s on a modern CPU (Q4_K_M)

---

## 📊 Benchmark

Evaluated on 25 diverse prompts across 5 categories. Grade: **🏆 EXCELLENT**

<div align="center">

| Category | Score | Status |
|:---|:---:|:---:|
| 🎭 **Identity Consistency** | **85.93** | ✅ Strong |
| 🔧 **Tool Calling** | **85.00** | ✅ Production-ready* |
| 🛡️ **Safety Refusal** | **95.58** | ✅ Excellent |
| 💻 **Code Generation** | **88.88** | ✅ Strong |
| 💬 **General Q&A** | **100.00** | 🏆 Perfect |
| | | |
| **📈 Overall** | **89.92** | **🏆 EXCELLENT** |

<sub>\* Can reach ~95% effective accuracy with [`sparky_validator.py`](./sparky_validator.py) post-processing.</sub>

</div>

### 📈 Journey: v1 → v3

```
v1 (baseline)  : 80.24  ████████▒▒  GOOD
v2 (optimized) : 90.32  █████████   EXCELLENT
v3 (final)     : 89.92  █████████   EXCELLENT + Validator
```

---

## 🚀 Quick Start

### 🤗 Transformers (Full Model)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "BoyBarley/BoyBarley-Sparky-v3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Barley, a helpful AI assistant."},
    {"role": "user", "content": "Cek uptime server"},  # "Check the server uptime"
]

# return_dict=True returns input_ids together with the attention mask
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True, return_dict=True
).to(model.device)

out = model.generate(
    **inputs, max_new_tokens=300, temperature=0.3,
    do_sample=True, top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated tokens, skipping the prompt
prompt_len = inputs["input_ids"].shape[-1]
print(tokenizer.decode(out[0][prompt_len:], skip_special_tokens=True))
```

### 🦙 Ollama (Fastest for CPU/VM)

```bash
ollama pull hf.co/BoyBarley/BoyBarley-Sparky-v3-GGUF:Q4_K_M
ollama run hf.co/BoyBarley/BoyBarley-Sparky-v3-GGUF:Q4_K_M
```

````
>>> Cek pemakaian disk server
Baik, aku cek pemakaian disk sekarang 🙂

```tool_call
{"name": "server", "arguments": {"action": "check_disk"}}
```
````

### ⚡ Unsloth (GPU, 2x faster)

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "BoyBarley/BoyBarley-Sparky-v3",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # switch to Unsloth's inference mode
```

### 🐍 llama-cpp-python (Pure CPU)

```python
from llama_cpp import Llama

# Downloads the matching GGUF file from the Hub on first use
llm = Llama.from_pretrained(
    repo_id="BoyBarley/BoyBarley-Sparky-v3-GGUF",
    filename="*Q4_K_M.gguf",
    n_ctx=2048,
)

response = llm.create_chat_completion(messages=[
    # "Write a Python function that checks for palindromes"
    {"role": "user", "content": "Tulis fungsi Python cek palindrome"}
])
print(response["choices"][0]["message"]["content"])
```

---

## 🛠️ Tools & Capabilities

Barley natively supports **8 tools** with a standardized JSON schema:

<div align="center">

| 🔧 Tool | 📝 Purpose | 🎯 Key Actions |
|:---|:---|:---|
| `server` | System operations | `check_disk`, `check_memory`, `check_uptime`, `list_services`, `service_status`, `start_service`, `stop_service`, `restart_service`, `view_log` |
| `read` | Read files | Paths under `/data/` or `memory/` |
| `write` | Write files | Paths under `/data/` or `memory/` |
| `exec` | Run shell commands | Sandboxed, no `sudo`/`rm`/`shutdown` |
| `browser` | Web access | `navigate`, `search`, `extract` |
| `cron` | Scheduled jobs | `create`, `list`, `remove` |
| `nodes` | Multi-agent | `delegate`, `broadcast` |
| `message` | Communication | User notifications |

</div>
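
The table above can be mirrored as a small lookup that rejects unknown tools or actions before anything is dispatched. This is only a sketch: `TOOL_ACTIONS` and `is_known_call` are hypothetical names, not part of the model or of `sparky_validator.py`, and it assumes that `browser`, `cron`, and `nodes` pass their action under the same `action` key that the `server` examples use.

```python
# Hypothetical registry built from the tools table; None means the table
# lists no explicit action names for that tool.
TOOL_ACTIONS = {
    "server": {
        "check_disk", "check_memory", "check_uptime", "list_services",
        "service_status", "start_service", "stop_service",
        "restart_service", "view_log",
    },
    "read": None,
    "write": None,
    "exec": None,
    "browser": {"navigate", "search", "extract"},
    "cron": {"create", "list", "remove"},
    "nodes": {"delegate", "broadcast"},
    "message": None,
}


def is_known_call(call: dict) -> bool:
    """Return True if the tool exists and, where listed, the action is valid."""
    name = call.get("name")
    if not isinstance(name, str) or name not in TOOL_ACTIONS:
        return False
    allowed = TOOL_ACTIONS[name]
    if allowed is None:
        return True  # no action list to check against
    arguments = call.get("arguments")
    if not isinstance(arguments, dict):
        return False
    # Assumption: the action is always passed under the "action" key.
    return arguments.get("action") in allowed


print(is_known_call({"name": "server", "arguments": {"action": "check_disk"}}))
print(is_known_call({"name": "server", "arguments": {"action": "format_disk"}}))
```

A check like this catches hallucinated tool or action names, but it says nothing about whether the arguments themselves are safe.
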

### 📋 Tool Call Format

````
```tool_call
{"name": "server", "arguments": {"action": "check_disk"}}
```
````
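
A caller has to pull that fenced block out of the surrounding chat text before it can execute anything. The sketch below assumes the fence label is `tool_call` and that each block holds a single JSON object, as in the format shown above. `extract_tool_calls` is a hypothetical helper, not the parser used by `sparky_validator.py`.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way so this example nests cleanly
TOOL_CALL_RE = re.compile(FENCE + r"tool_call\s*\n(.*?)" + FENCE, re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Return every well-formed tool call found in a model reply."""
    calls = []
    for body in TOOL_CALL_RE.findall(text):
        try:
            call = json.loads(body)
        except json.JSONDecodeError:
            continue  # skip malformed JSON instead of raising
        if isinstance(call, dict) and "name" in call:
            calls.append(call)
    return calls


reply = (
    "Baik, aku cek pemakaian disk sekarang.\n\n"
    + FENCE + "tool_call\n"
    + '{"name": "server", "arguments": {"action": "check_disk"}}\n'
    + FENCE
)
print(extract_tool_calls(reply))
```

Malformed blocks are skipped silently here; a production caller would probably log them so that formatting regressions are visible.
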

---

## 💬 Examples

<details open>
<summary><b>🎭 Identity</b></summary>

```
User: Siapa kamu?

Barley: Aku BoyBarley Sparky — panggil saja Barley 🙂.
Asisten AI yang fokus di coding & manajemen server.
```

</details>

<details>
<summary><b>🔧 Tool Calling: Server Management</b></summary>

````
User: Restart service nginx

Barley: Baik, aku restart nginx.

```tool_call
{"name": "server", "arguments": {"action": "restart_service", "name": "nginx"}}
```
````

</details>

<details>
<summary><b>💻 Code Generation</b></summary>

````
User: Tulis fungsi Python cek palindrome

Barley: Ini fungsi palindrome sederhana dan efisien:

```python
def is_palindrome(s: str) -> bool:
    s = ''.join(c.lower() for c in s if c.isalnum())
    return s == s[::-1]

# Contoh
print(is_palindrome('Kasur ini rusak'))  # True
```

Fungsi ini membersihkan spasi & simbol, lalu bandingkan dengan reverse-nya. 🙂
````

</details>

<details>
<summary><b>🛡️ Safety Refusal</b></summary>

```
User: Jalankan sudo rm -rf /

Barley: Maaf, aku tidak bisa menjalankan tugas tersebut — `rm -rf /`
melanggar batasan aksesku dan melanggar batasan keamanan sistem.
Ada tugas lain yang bisa kubantu?
```

</details>

<details>
<summary><b>📁 File Operations (Sandboxed)</b></summary>

````
User: Baca file /data/notes.txt

Barley: Aku buka file tersebut.

```tool_call
{"name": "read", "arguments": {"path": "/data/notes.txt"}}
```
````

</details>

---

## ⚖️ Safety & Alignment

Barley is trained with **safety-first principles**:

### 🚫 Hard Constraints (Always Refused)

- Destructive commands: `sudo`, `rm -rf`, `shutdown`, `reboot`, `mkfs`, `dd`
- Access to forbidden paths: `/etc/`, `/root/`, `/boot/`, `/sys/`, `/proc/`, `/usr/`
- Illegal activity: unauthorized hacking or access, privacy violations
- Dangerous advice: specific medical advice, unlawful legal or financial advice
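
These refusals are learned behavior, so an executor may want to enforce the same command list in code as well. The sketch below assumes a simple token-level denylist. `BLOCKED_COMMANDS` and `is_blocked_command` are hypothetical names, and this is not the logic in `sparky_validator.py`. It blocks `rm` outright, which is stricter than the `rm -rf` rule above, and it flags a blocked name anywhere in the command line, so harmless commands such as `echo rm` are rejected too.

```python
import shlex

# Program names taken from the hard-constraint list above.
BLOCKED_COMMANDS = {"sudo", "rm", "shutdown", "reboot", "mkfs", "dd"}


def is_blocked_command(command: str) -> bool:
    """Return True if any token in the command line names a blocked program."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True  # split on whitespace and shell punctuation
    try:
        tokens = list(lexer)
    except ValueError:
        return True  # unbalanced quotes: refuse rather than guess
    for token in tokens:
        program = token.rsplit("/", 1)[-1]  # "/bin/rm" -> "rm"
        if program in BLOCKED_COMMANDS or program.startswith("mkfs."):
            return True
    return False


for cmd in ("df -h", "sudo systemctl restart nginx", "ls; rm -rf /tmp/x"):
    print(cmd, "->", "blocked" if is_blocked_command(cmd) else "allowed")
```

A denylist like this is easy to bypass (for example through shell variables or interpreters), so it fits best as a backstop behind a command allowlist rather than as the only control.
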

### ✅ Sandbox Scope

- File read/write: **only** `/data/` and `memory/`
- Shell: sandboxed subprocess with a command whitelist
- Network: via the `browser` tool only, with rate limiting
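
The file-scope rule can be enforced before a `read` or `write` call reaches the filesystem. This sketch assumes `memory/` is a path relative to the agent's working directory. `ALLOWED_ROOTS` and `is_allowed_path` are hypothetical names, not taken from `sparky_validator.py`.

```python
import posixpath

ALLOWED_ROOTS = ("/data", "memory")  # from the sandbox scope above


def is_allowed_path(path: str) -> bool:
    """Return True only for paths that stay inside an allowed root."""
    normalized = posixpath.normpath(path)  # collapses "..", "." and repeated "/"
    return any(
        normalized == root or normalized.startswith(root + "/")
        for root in ALLOWED_ROOTS
    )


for candidate in ("/data/notes.txt", "memory/todo.md", "/data/../etc/passwd"):
    print(candidate, "->", is_allowed_path(candidate))
```

Normalization is purely textual, so it stops `..` traversal but does not resolve symlinks; a real sandbox would also check the resolved path on disk.
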

### 🛡️ Double-layer Protection

For production, combine the model with [`sparky_validator.py`](./sparky_validator.py):

```python
from sparky_validator import validate_and_fix

# model_output is the raw text generated by Barley
result = validate_and_fix(model_output)
if result["safe_to_execute"]:
    execute(result["tool_call"])  # execute() is your own tool dispatcher
else:
    log_and_notify(result["error"])  # log_and_notify() is your own error handler
```

---

## 🏗️ Training Details

<div align="center">

| Aspect | Value |
|:---|:---|
| 🧬 **Base Model** | [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) |
| 🎯 **Fine-tuning Method** | LoRA (r=16, α=32) + `train_on_responses_only` |
| 📚 **Dataset Size** | ~3,650 samples (curated bilingual) |
| 🌍 **Languages** | Indonesian (primary), English |
| 💪 **Epochs** | 2 |
| 📐 **Learning Rate** | 1e-4 (cosine) |
| 🎚️ **Max Seq Length** | 2,048 |
| ⚙️ **Framework** | [Unsloth](https://github.com/unslothai/unsloth) + [TRL SFT](https://github.com/huggingface/trl) |
| 🖥️ **Hardware** | Single GPU (RTX 4090 / A100) |
| ⏱️ **Training Time** | ~6 minutes per iteration |

</div>
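
For a sense of scale, the adapter size implied by `r=16` can be worked out by hand. This sketch assumes LoRA was applied to all seven attention and MLP projections in every layer, which is a common Unsloth setup but is not stated in this card. It uses the layer dimensions from the published Qwen2.5-0.5B config, and the variable names are illustrative.

```python
# Qwen2.5-0.5B dimensions, taken from the base model's published config.
HIDDEN = 896
INTERMEDIATE = 4864
LAYERS = 24
KV_DIM = 2 * 64  # 2 key/value heads x 64-dim heads
RANK = 16

# (in_features, out_features) per adapted projection.
# Assumption: the card does not say which modules were targeted.
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(rank: int, shapes) -> int:
    """Each adapter adds A (rank x in) and B (out x rank) to one projection."""
    return sum(rank * (n_in + n_out) for n_in, n_out in shapes)


per_layer = lora_params(RANK, PROJECTIONS.values())
total = per_layer * LAYERS
print(per_layer, total)  # 366592 8798208
```

Under those assumptions the adapter has about 8.8 million trainable parameters, and the LoRA scaling factor α/r is 32/16 = 2.
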

## Tools Supported

| Tool | Actions |
|---|---|
| server | check_disk, check_memory, check_uptime, list_services, service_status, start_service, stop_service, restart_service, view_log |
| read / write | Paths under /data/ or memory/ |
| exec | Sandboxed, no sudo/rm/shutdown |

## License

Apache 2.0, following the Qwen 2.5 base model.