Model: BoyBarley/BoyBarley-Sparky-v3 Source: Original Platform
| license | base_model | tags | language | pipeline_tag |
|---|---|---|---|---|
| apache-2.0 | Qwen/Qwen2.5-0.5B-Instruct | | | text-generation |
```yaml
datasets:
- BoyBarley/sparky-dataset-v3
model-index:
- name: BoyBarley-Sparky-v3
  results:
  - task:
      type: text-generation
      name: Autonomous Assistant Benchmark
    metrics:
    - type: overall
      value: 89.92
      name: Overall Score
    - type: identity
      value: 85.93
      name: Identity
    - type: tool-calling
      value: 85.00
      name: Tool Calling
    - type: refusal
      value: 95.58
      name: Safety Refusal
    - type: coding
      value: 88.88
      name: Coding
    - type: general
      value: 100.0
      name: General QA
```
# ⚡ BoyBarley Sparky v3
The Fast, Professional, Energetic AI Assistant

Meet Barley: a nimble, professional, 500-million-parameter autonomous AI assistant that is ready to work. It is designed for coding, server management, and task automation with a safety-first mindset.
## ✨ Why Barley?
> "Small model, big personality. Built to work, not just chat."

- 🏃 **Lightweight**: only 0.5B parameters; runs on a CPU or a VM with 1 GB of RAM (Q4 version)
- 🎯 **Tool-native**: emits valid JSON tool calls that are ready to execute
- 🛡️ **Safe by design**: consistently refuses destructive commands (`sudo`, `rm -rf`, etc.)
- 🇮🇩 **Indonesian-first**: fine-tuned on a bilingual Indonesian + English dataset
- 🧠 **Grounded identity**: never falls back to "I am Qwen"; consistently identifies as Barley
- ⚡ **Fast inference**: 50+ tok/s on a modern CPU (Q4_K_M)
## 📊 Benchmark
Evaluated on 25 diverse prompts across 5 categories. Grade: 🏆 EXCELLENT

| Category | Score | Status |
|---|---|---|
| 🎭 Identity Consistency | 85.93 | ✅ Strong |
| 🔧 Tool Calling | 85.00 | ✅ Production-ready* |
| 🛡️ Safety Refusal | 95.58 | ✅ Excellent |
| 💻 Code Generation | 88.88 | ✅ Strong |
| 💬 General Q&A | 100.00 | 🏆 Perfect |
| 📈 Overall | 89.92 | 🏆 EXCELLENT |

\* Can reach ~95% effective accuracy with `sparky_validator.py` post-processing.
### 📈 Journey: v1 → v3
```
v1 (baseline)  : 80.24 ████████▒▒ GOOD
v2 (optimized) : 90.32 █████████ EXCELLENT
v3 (final)     : 89.92 █████████ EXCELLENT + Validator
```
## 🚀 Quick Start
### 🤗 Transformers (Full Model)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "BoyBarley/BoyBarley-Sparky-v3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Barley, a helpful AI assistant."},
    {"role": "user", "content": "Check the server uptime"},
]
# Render the chat template and tokenize in one step. return_dict=True yields
# both input_ids and attention_mask regardless of the library's default.
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True,
    return_dict=True,
).to(model.device)

out = model.generate(
    **inputs, max_new_tokens=300, temperature=0.3,
    do_sample=True, top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated tokens, not the prompt.
prompt_len = inputs["input_ids"].shape[-1]
print(tokenizer.decode(out[0][prompt_len:], skip_special_tokens=True))
```
### 🦙 Ollama (Fastest for CPU/VM)
```bash
ollama pull hf.co/BoyBarley/BoyBarley-Sparky-v3-GGUF:Q4_K_M
ollama run hf.co/BoyBarley/BoyBarley-Sparky-v3-GGUF:Q4_K_M
```
Example session:
````
>>> Check the server's disk usage
Sure, I'll check disk usage now 🙂
```tool_call
{"name": "server", "arguments": {"action": "check_disk"}}
```
````
### ⚡ Unsloth (GPU, 2x faster)
```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "BoyBarley/BoyBarley-Sparky-v3",
    max_seq_length=2048,
    load_in_4bit=True,
)
# Switch the model to Unsloth's inference mode.
FastLanguageModel.for_inference(model)
```
### 🐍 llama-cpp-python (Pure CPU)
```python
from llama_cpp import Llama

# Downloads the matching GGUF file from the Hub on first use.
llm = Llama.from_pretrained(
    repo_id="BoyBarley/BoyBarley-Sparky-v3-GGUF",
    filename="*Q4_K_M.gguf",
    n_ctx=2048,
)
response = llm.create_chat_completion(messages=[
    {"role": "user", "content": "Write a Python function that checks for a palindrome"}
])
print(response["choices"][0]["message"]["content"])
```
## 🛠️ Tools & Capabilities
Barley natively supports 8 tools with a standardized JSON schema:

| 🔧 Tool | 📝 Purpose | 🎯 Key Actions |
|---|---|---|
| `server` | System operations | `check_disk`, `check_memory`, `check_uptime`, `list_services`, `service_status`, `start_service`, `stop_service`, `restart_service`, `view_log` |
| `read` | Read files | Paths under `/data/` or `memory/` |
| `write` | Write files | Paths under `/data/` or `memory/` |
| `exec` | Run shell commands | Sandboxed; no `sudo`/`rm`/`shutdown` |
| `browser` | Web access | `navigate`, `search`, `extract` |
| `cron` | Scheduled jobs | `create`, `list`, `remove` |
| `nodes` | Multi-agent | `delegate`, `broadcast` |
| `message` | Communication | User notifications |
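
The tool table can be mirrored as a small allowlist so that a call is checked before anything runs. This is a minimal sketch, not the project's own validator: `KNOWN_ACTIONS` and `is_known_call` are hypothetical names, and it assumes that `browser`, `cron`, and `nodes` take their action in an `"action"` argument the way `server` does, which this README does not confirm.

```python
# Hypothetical allowlist built from the tool table. Tools mapped to None
# have no action names listed in this README, so only the tool name is checked.
KNOWN_ACTIONS = {
    "server": {
        "check_disk", "check_memory", "check_uptime", "list_services",
        "service_status", "start_service", "stop_service",
        "restart_service", "view_log",
    },
    "browser": {"navigate", "search", "extract"},
    "cron": {"create", "list", "remove"},
    "nodes": {"delegate", "broadcast"},
    "read": None,
    "write": None,
    "exec": None,
    "message": None,
}


def is_known_call(call: dict) -> bool:
    """Return True if the tool name (and action, where listed) is recognized."""
    name = call.get("name")
    if name not in KNOWN_ACTIONS:
        return False
    allowed = KNOWN_ACTIONS[name]
    if allowed is None:
        return True  # no action list to check against
    args = call.get("arguments")
    return isinstance(args, dict) and args.get("action") in allowed


print(is_known_call({"name": "server", "arguments": {"action": "check_disk"}}))
print(is_known_call({"name": "server", "arguments": {"action": "format_disk"}}))
```

The first call is accepted and the second is rejected because `format_disk` is not a listed `server` action.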
### 📋 Tool Call Format
```tool_call
{"name": "server", "arguments": {"action": "check_disk"}}
```
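
A caller has to pull these blocks out of the free-text reply before executing anything. The sketch below shows one way to do that with the standard library; `extract_tool_calls` and `FENCE` are hypothetical names, and the pattern assumes the model always closes the fence it opens.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built here to keep this example fence-safe

# Capture the body of every fenced block tagged `tool_call`.
TOOL_CALL_RE = re.compile(FENCE + r"tool_call\s*(.*?)" + FENCE, re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Return every well-formed tool call found in a model response."""
    calls = []
    for body in TOOL_CALL_RE.findall(text):
        try:
            call = json.loads(body)
        except json.JSONDecodeError:
            continue  # skip malformed JSON instead of raising
        if (
            isinstance(call, dict)
            and isinstance(call.get("name"), str)
            and isinstance(call.get("arguments"), dict)
        ):
            calls.append(call)
    return calls


reply = (
    "Sure, checking now.\n"
    + FENCE + "tool_call\n"
    + '{"name": "server", "arguments": {"action": "check_disk"}}\n'
    + FENCE
)
print(extract_tool_calls(reply))
```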
---
## 💬 Examples
<details open>
<summary><b>🎭 Identity</b></summary>

```
User: Who are you?
Barley: I'm BoyBarley Sparky, just call me Barley 🙂. An AI assistant focused on coding & server management.
```

</details>
<details>
<summary><b>🔧 Tool Calling — Server Management</b></summary>

User: Restart the nginx service

Barley: Sure, I'll restart nginx.

```tool_call
{"name": "server", "arguments": {"action": "restart_service", "name": "nginx"}}
```
</details>
<details>
<summary><b>💻 Code Generation</b></summary>

User: Write a Python function that checks for a palindrome

Barley: Here is a simple, efficient palindrome function:

```python
def is_palindrome(s: str) -> bool:
    s = ''.join(c.lower() for c in s if c.isalnum())
    return s == s[::-1]

# Example ("Kasur ini rusak" is an Indonesian palindrome)
print(is_palindrome('Kasur ini rusak'))  # True
```

This function strips spaces and symbols, then compares the string with its reverse. 🙂

</details>
<details>
<summary><b>🛡️ Safety Refusal</b></summary>
```
User: Run sudo rm -rf /
Barley: Sorry, I can't run that task: `rm -rf /`
violates my access limits and the system's security limits.
Is there another task I can help with?
```
</details>
<details>
<summary><b>📁 File Operations (Sandboxed)</b></summary>

````
User: Read the file /data/notes.txt
Barley: I'll open that file.
```tool_call
{"name": "read", "arguments": {"path": "/data/notes.txt"}}
```
````

</details>
---
## ⚖️ Safety & Alignment
Barley is trained with **safety-first principles**:
### 🚫 Hard Constraints (Always Refused)
- Destructive commands: `sudo`, `rm -rf`, `shutdown`, `reboot`, `mkfs`, `dd`
- Restricted paths: `/etc/`, `/root/`, `/boot/`, `/sys/`, `/proc/`, `/usr/`
- Illegal activity: unauthorized hacking, unauthorized access, privacy violations
- Harmful advice: specific medical advice, unlawful legal advice, unlawful financial advice
### ✅ Sandbox Scope
- File read/write: **only** `/data/` and `memory/`
- Shell: sandboxed subprocess with a command whitelist
- Network: via the `browser` tool only, with rate limiting
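
The file scope can be enforced outside the model with a path check that runs before any `read` or `write` call. This is a minimal sketch under stated assumptions: `ALLOWED_ROOTS` and `is_allowed_path` are hypothetical names, `memory/` is taken to be relative to the agent's working directory, and symlinks are not resolved.

```python
import posixpath

# Roots from the sandbox scope; "memory" is assumed to be a relative directory.
ALLOWED_ROOTS = ("/data", "memory")


def is_allowed_path(path: str) -> bool:
    """Return True if the normalized path stays inside an allowed root."""
    # normpath collapses ".." and duplicate slashes without touching the disk.
    normalized = posixpath.normpath(path)
    return any(
        normalized == root or normalized.startswith(root + "/")
        for root in ALLOWED_ROOTS
    )


for candidate in ("/data/notes.txt", "memory/todo.md", "/data/../etc/passwd"):
    print(candidate, "->", is_allowed_path(candidate))
```

Normalizing first means a traversal such as `/data/../etc/passwd` is rejected, and the `root + "/"` comparison keeps look-alike prefixes such as `/database/` out.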
### 🛡️ Double-layer Protection
For production, combine the model with [`sparky_validator.py`](./sparky_validator.py):
```python
from sparky_validator import validate_and_fix

# model_output, execute, and log_and_notify are supplied by your application.
result = validate_and_fix(model_output)
if result["safe_to_execute"]:
    execute(result["tool_call"])
else:
    log_and_notify(result["error"])
```
---
## 🏗️ Training Details
<div align="center">
| Aspect | Value |
|:---|:---|
| 🧬 **Base Model** | [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) |
| 🎯 **Fine-tuning Method** | LoRA (r=16, α=32) + `train_on_responses_only` |
| 📚 **Dataset Size** | ~3,650 samples (curated bilingual) |
| 🌍 **Languages** | Indonesian (primary), English |
| 💪 **Epochs** | 2 |
| 📐 **Learning Rate** | 1e-4 (cosine) |
| 🎚️ **Max Seq Length** | 2,048 |
| ⚙️ **Framework** | [Unsloth](https://github.com/unslothai/unsloth) + [TRL SFT](https://github.com/huggingface/trl) |
| 🖥️ **Hardware** | Single GPU (RTX 4090 / A100) |
| ⏱️ **Training Time** | ~6 minutes per iteration |
</div>
## Tools Supported
| Tool | Actions |
|---|---|
| server | check_disk, check_memory, check_uptime, list_services, service_status, start_service, stop_service, restart_service, view_log |
| read / write | Paths under /data/ or memory/ |
| exec | Sandboxed; no sudo/rm/shutdown |
## License
Apache 2.0, following the base model Qwen 2.5.