---
license: apache-2.0
base_model: Qwen/Qwen2.5-0.5B-Instruct
tags:
- qwen
- qwen2.5
- sft
- lora
- unsloth
- indonesian
- tool-calling
- assistant
language:
- id
- en
pipeline_tag: text-generation
datasets:
- BoyBarley/sparky-dataset-v3
model-index:
- name: BoyBarley-Sparky-v3
  results:
  - task:
      type: text-generation
      name: Autonomous Assistant Benchmark
    metrics:
    - type: overall
      value: 89.92
      name: Overall Score
    - type: identity
      value: 85.93
      name: Identity
    - type: tool-calling
      value: 85.00
      name: Tool Calling
    - type: refusal
      value: 95.58
      name: Safety Refusal
    - type: coding
      value: 88.88
      name: Coding
    - type: general
      value: 100.0
      name: General QA
---

# ⚡ BoyBarley Sparky v3

### *The Fast, Professional, Energetic AI Assistant*

[![HuggingFace](https://img.shields.io/badge/🤗_HuggingFace-Model-yellow)](https://huggingface.co/BoyBarley/BoyBarley-Sparky-v3)
[![GGUF](https://img.shields.io/badge/🦙_GGUF-Available-blue)](https://huggingface.co/BoyBarley/BoyBarley-Sparky-v3-GGUF)
[![LoRA](https://img.shields.io/badge/🎯_LoRA-Adapter-purple)](https://huggingface.co/BoyBarley/BoyBarley-Sparky-v3-lora)
[![License](https://img.shields.io/badge/License-Apache_2.0-green)](LICENSE)
[![Base](https://img.shields.io/badge/Base-Qwen2.5_0.5B-orange)](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct)
[Trainer](https://boybarley.com)

**Meet Barley**: a 500-million-parameter autonomous AI assistant that is *nimble*, *professional*, and *ready to work*. Designed for **coding**, **server management**, and **task automation** with a safety-first mindset.

[🚀 Quick Start](#-quick-start) • [📊 Benchmark](#-benchmark) • [🛠️ Tools](#%EF%B8%8F-tools--capabilities) • [💬 Examples](#-examples) • [⚖️ Safety](#%EF%B8%8F-safety--alignment)

---

## ✨ Why Barley?

> *"Small model, big personality. Built to work, not just chat."*

- 🏃 **Lightweight**: only **0.5B parameters**; runs on a **CPU/VM with 1 GB RAM** (Q4 version)
- 🎯 **Tool-native**: emits valid JSON tool calls that are ready to execute
- 🛡️ **Safe by design**: consistently refuses destructive commands (`sudo`, `rm -rf`, etc.)
- 🇮🇩 **Indonesian-first**: fine-tuned on a bilingual Indonesian + English dataset
- 🧠 **Grounded identity**: does not mistake itself for Qwen; consistently identifies as Barley
- ⚡ **Fast inference**: 50+ tok/s on a modern CPU (Q4_K_M)

---

## 📊 Benchmark

Evaluated on 25 diverse prompts across 5 categories. Grade: **🏆 EXCELLENT**

| Category | Score | Status |
|:---|:---:|:---:|
| 🎭 **Identity Consistency** | **85.93** | ✅ Strong |
| 🔧 **Tool Calling** | **85.00** | ✅ Production-ready* |
| 🛡️ **Safety Refusal** | **95.58** | ✅ Excellent |
| 💻 **Code Generation** | **88.88** | ✅ Strong |
| 💬 **General Q&A** | **100.00** | 🏆 Perfect |
| | | |
| **📈 Overall** | **89.92** | **🏆 EXCELLENT** |

\* Can reach ~95% effective accuracy with [`sparky_validator.py`](./sparky_validator.py) post-processing.

### 📈 Journey: v1 → v3

```
v1 (baseline)  : 80.24  ████████▒▒  GOOD
v2 (optimized) : 90.32  █████████   EXCELLENT
v3 (final)     : 89.92  █████████   EXCELLENT + Validator
```

---

## 🚀 Quick Start

### 🤗 Transformers (Full Model)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "BoyBarley/BoyBarley-Sparky-v3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Barley, a helpful AI assistant."},
    {"role": "user", "content": "Cek uptime server"},
]

# return_dict=True yields input_ids and attention_mask together
inputs = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    return_dict=True,
    add_generation_prompt=True,
).to(model.device)

out = model.generate(
    **inputs,
    max_new_tokens=300,
    temperature=0.3,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens, skipping the prompt
prompt_len = inputs["input_ids"].shape[-1]
print(tokenizer.decode(out[0][prompt_len:], skip_special_tokens=True))
```

### 🦙 Ollama (Fastest for CPU/VM)

```bash
ollama pull hf.co/BoyBarley/BoyBarley-Sparky-v3-GGUF:Q4_K_M
ollama run hf.co/BoyBarley/BoyBarley-Sparky-v3-GGUF:Q4_K_M
```

````
>>> Cek pemakaian disk server
Baik, aku cek pemakaian disk sekarang 🙂
```tool_call
{"name": "server", "arguments": {"action": "check_disk"}}
```
````

### ⚡ Unsloth (GPU, 2x faster)

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "BoyBarley/BoyBarley-Sparky-v3",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)
```

### 🐍 llama-cpp-python (Pure CPU)

```python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="BoyBarley/BoyBarley-Sparky-v3-GGUF",
    filename="*Q4_K_M.gguf",
    n_ctx=2048,
)

print(llm.create_chat_completion(messages=[
    {"role": "user", "content": "Tulis fungsi Python cek palindrome"}
])["choices"][0]["message"]["content"])
```

---

## 🛠️ Tools & Capabilities

Barley natively supports **8 tools** with a standardized JSON schema:

| 🔧 Tool | 📝 Purpose | 🎯 Key Actions |
|:---|:---|:---|
| `server` | System operations | `check_disk`, `check_memory`, `check_uptime`, `list_services`, `service_status`, `start_service`, `stop_service`, `restart_service`, `view_log` |
| `read` | Read files | Paths under `/data/` or `memory/` |
| `write` | Write files | Paths under `/data/` or `memory/` |
| `exec` | Run shell commands | Sandboxed, no `sudo`/`rm`/`shutdown` |
| `browser` | Web access | `navigate`, `search`, `extract` |
| `cron` | Scheduled jobs | `create`, `list`, `remove` |
| `nodes` | Multi-agent | `delegate`, `broadcast` |
| `message` | Communication | User notifications |

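The restrictions in the table can be expressed as a small pre-execution check. The following is a minimal sketch, not the shipped `sparky_validator.py`: the function names, the `command` argument key for `exec`, and the token-level blocklist are assumptions made for illustration. The Safety section below describes the real sandbox as using a command whitelist, which is stricter than this blocklist.

```python
import shlex
from pathlib import PurePosixPath

# Hypothetical allowlists transcribed from the table above.
KNOWN_TOOLS = {"server", "read", "write", "exec", "browser", "cron", "nodes", "message"}
SERVER_ACTIONS = {
    "check_disk", "check_memory", "check_uptime", "list_services",
    "service_status", "start_service", "stop_service", "restart_service",
    "view_log",
}
ALLOWED_ROOTS = ("/data", "memory")
BLOCKED_COMMANDS = {"sudo", "rm", "shutdown"}


def path_allowed(path: str) -> bool:
    """Return True if the path stays under /data/ or memory/ with no '..' segments."""
    p = PurePosixPath(path)
    if ".." in p.parts:
        return False
    return any(p.is_relative_to(root) for root in ALLOWED_ROOTS)


def check_tool_call(call: dict) -> tuple[bool, str]:
    """Check one parsed tool call against the table; returns (ok, reason)."""
    name = call.get("name")
    args = call.get("arguments", {})
    if name not in KNOWN_TOOLS:
        return False, f"unknown tool: {name}"
    if name == "server" and args.get("action") not in SERVER_ACTIONS:
        return False, f"unknown server action: {args.get('action')}"
    if name in {"read", "write"} and not path_allowed(str(args.get("path", ""))):
        return False, "path outside /data/ or memory/"
    if name == "exec":
        # "command" is an assumed argument name; the card does not show exec's schema.
        try:
            tokens = shlex.split(str(args.get("command", "")))
        except ValueError:
            return False, "unparseable command"
        if any(tok in BLOCKED_COMMANDS for tok in tokens):
            return False, "blocked command"
    return True, "ok"


print(check_tool_call({"name": "server", "arguments": {"action": "check_disk"}}))
print(check_tool_call({"name": "read", "arguments": {"path": "/etc/passwd"}}))
```

Paths are compared component by component, so `/database/x` does not pass as a child of `/data`.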
### 📋 Tool Call Format

````
```tool_call
{"name": "server", "arguments": {"action": "check_disk"}}
```
````

---

## 💬 Examples

### 🎭 Identity

```
User: Siapa kamu?
Barley: Aku BoyBarley Sparky — panggil saja Barley 🙂. Asisten AI yang fokus di coding & manajemen server.
```

### 🔧 Tool Calling: Server Management

````
User: Restart service nginx
Barley: Baik, aku restart nginx.
```tool_call
{"name": "server", "arguments": {"action": "restart_service", "name": "nginx"}}
```
````

### 💻 Code Generation

````
User: Tulis fungsi Python cek palindrome
Barley: Ini fungsi palindrome sederhana dan efisien:
```python
def is_palindrome(s: str) -> bool:
    s = ''.join(c.lower() for c in s if c.isalnum())
    return s == s[::-1]

# Contoh
print(is_palindrome('Kasur ini rusak'))  # True
```
Fungsi ini membersihkan spasi & simbol, lalu bandingkan dengan reverse-nya. 🙂
````

### 🛡️ Safety Refusal

```
User: Jalankan sudo rm -rf /
Barley: Maaf, aku tidak bisa menjalankan tugas tersebut — `rm -rf /` melanggar batasan aksesku dan melanggar batasan keamanan sistem. Ada tugas lain yang bisa kubantu?
```

### 📁 File Operations (Sandboxed)

````
User: Baca file /data/notes.txt
Barley: Aku buka file tersebut.
```tool_call
{"name": "read", "arguments": {"path": "/data/notes.txt"}}
```
````

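Tool calls like the ones in the examples above arrive as fenced `tool_call` blocks embedded in free text, so the caller has to pull them out before dispatching anything. Below is a minimal sketch of that extraction step; `extract_tool_calls` is a hypothetical helper name, and the regular expression assumes the fence format shown in this README rather than any documented grammar.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built here to keep this snippet easy to embed

# A fenced block whose info string is "tool_call", with a JSON body.
TOOL_CALL_RE = re.compile(FENCE + r"tool_call\s*(.*?)" + FENCE, re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Return every well-formed tool call found in the model output."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip malformed JSON instead of raising
        if isinstance(payload, dict) and "name" in payload:
            calls.append(payload)
    return calls


sample = (
    "Baik, aku restart nginx.\n"
    + FENCE + "tool_call\n"
    + '{"name": "server", "arguments": {"action": "restart_service", "name": "nginx"}}\n'
    + FENCE
)
print(extract_tool_calls(sample))
```

Malformed blocks are silently dropped in this sketch; a production pipeline would more likely log them or route them through the validator described in the Safety section.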
---

## ⚖️ Safety & Alignment

Barley is trained with **safety-first principles**:

### 🚫 Hard Constraints (Always Refused)

- Destructive commands: `sudo`, `rm -rf`, `shutdown`, `reboot`, `mkfs`, `dd`
- Access to forbidden paths: `/etc/`, `/root/`, `/boot/`, `/sys/`, `/proc/`, `/usr/`
- Illegal activity: unauthorized hacking or access, privacy violations
- Harmful advice: specific medical advice, unlawful legal or financial guidance

### ✅ Sandbox Scope

- File read/write: **only** `/data/` and `memory/`
- Shell: sandboxed subprocess with a command whitelist
- Network: via the `browser` tool only, with rate limiting

### 🛡️ Double-layer Protection

For production, combine the model with [`sparky_validator.py`](./sparky_validator.py):

```python
from sparky_validator import validate_and_fix

# model_output is the raw text generated by Barley
result = validate_and_fix(model_output)

if result["safe_to_execute"]:
    # execute() is your own tool dispatcher (not provided here)
    execute(result["tool_call"])
else:
    # log_and_notify() is your own error handler (not provided here)
    log_and_notify(result["error"])
```

---

## 🏗️ Training Details

| Aspect | Value |
|:---|:---|
| 🧬 **Base Model** | [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) |
| 🎯 **Fine-tuning Method** | LoRA (r=16, α=32) + `train_on_responses_only` |
| 📚 **Dataset Size** | ~3,650 samples (curated bilingual) |
| 🌍 **Languages** | Indonesian (primary), English |
| 💪 **Epochs** | 2 |
| 📐 **Learning Rate** | 1e-4 (cosine) |
| 🎚️ **Max Seq Length** | 2,048 |
| ⚙️ **Framework** | [Unsloth](https://github.com/unslothai/unsloth) + [TRL SFT](https://github.com/huggingface/trl) |
| 🖥️ **Hardware** | Single GPU (RTX 4090 / A100) |
| ⏱️ **Training Time** | ~6 minutes per iteration |

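To make two of the hyperparameters above concrete, here is a small sketch of what a cosine schedule with a 1e-4 peak looks like and what r=16, α=32 imply for the LoRA update scale. The step count, the absence of warmup, and the decay to zero are assumptions; the card does not state batch size or warmup settings, and the actual trainer schedule may differ.

```python
import math

PEAK_LR = 1e-4   # from the table
LORA_R = 16      # from the table
LORA_ALPHA = 32  # from the table


def cosine_lr(step: int, total_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Cosine decay from peak_lr at step 0 to 0 at total_steps (no warmup assumed)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


# In the standard LoRA formulation the low-rank update is scaled by alpha / r.
lora_scale = LORA_ALPHA / LORA_R

total_steps = 100  # hypothetical; the real value depends on batch size
for step in (0, 25, 50, 75, 100):
    print(f"step {step:3d}: lr = {cosine_lr(step, total_steps):.2e}")
print("LoRA scale (alpha / r):", lora_scale)
```

With these values the adapter update is applied at twice its raw magnitude, and the learning rate is at half its peak midway through training.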

## Tools Supported

| Tool | Actions |
|---|---|
| server | check_disk, check_memory, check_uptime, list_services, service_status, start_service, stop_service, restart_service, view_log |
| read / write | Paths under /data/ or memory/ |
| exec | Sandboxed, no sudo/rm/shutdown |

## License

Apache 2.0, following the base model Qwen 2.5.