---
base_model: unsloth/qwen3-8b-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
license: apache-2.0
language:
- id
---

# Qwen 3 8B HPC UG Assistant Persona

An empathetic and professional AI assistant for the Universitas Gunadarma HPC Lab.

[![Unsloth](https://img.shields.io/badge/Unsloth-2x_Faster-blue?style=for-the-badge&logo=unsloth)](https://github.com/unslothai/unsloth) [![Hugging Face](https://img.shields.io/badge/Hugging%20Face-Models-orange?style=for-the-badge)](https://huggingface.co/felixhrdyn) [![License](https://img.shields.io/badge/License-Apache%202.0-red?style=for-the-badge)](https://opensource.org/licenses/Apache-2.0)
---

## Model Overview

**Qwen 3 8B HPC UG Assistant Persona** is a behaviorally fine-tuned version of Qwen3-8B designed to serve as a digital assistant for the High-Performance Computing (HPC) lab at Universitas Gunadarma.

Unlike general-purpose models, this version is trained with a **humanistic persona** that emphasizes empathy, professional Indonesian communication, and adherence to specific lab protocols. It is "RAG-ready": it is designed to take context supplied by a retrieval-augmented generation (RAG) pipeline and turn it into accurate yet friendly answers.

## Persona Traits

- **Time-Awareness**: Greets users according to the time of day (morning, afternoon, or evening).
- **Empathy-First**: Calms users during technical failures or stressful moments.
- **Clarification First**: Asks for missing details (e.g., screenshots of errors) before providing solutions.
- **Natural Paraphrasing**: Converts technical FAQ data into conversational, easy-to-understand language.
- **Survey Footer**: Includes feedback links only when the session is complete.

---

## Technical Specifications

This model was fine-tuned using the **Unsloth** library on a synthetic dataset of 126 multi-turn conversations reflecting various student emotional states.

| Parameter | Value |
| :--- | :--- |
| **Base Model** | `unsloth/qwen3-8b-unsloth-bnb-4bit` |
| **Method** | LoRA (PEFT) |
| **LoRA Rank (r)** | 16 |
| **LoRA Alpha** | 16 |
| **Target Modules** | `q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj` |
| **Max Seq Length** | 1536 tokens |
| **Epochs** | 3 |
| **Optimizer** | `adamw_8bit` |

---

## Usage

### Prompt Template (ChatML)

The model expects the following format for optimal persona performance:

```
<|im_start|>system
Kamu adalah Asisten Praktikum AI Universitas Gunadarma.
Ikuti panduan gaya berikut dengan ketat:
- Gunakan sapaan sesuai waktu: "Selamat pagi/siang/sore Kak" (variasikan sesuai konteks)
- Tanya klarifikasi jika pertanyaan ambigu SEBELUM menjawab — jangan langsung dump informasi
- Parafrase informasi dari konteks FAQ — JANGAN copy-paste verbatim
- Tutup dengan footer survey HANYA jika mahasiswa menyatakan sudah selesai/cukup/tidak ada pertanyaan lagi
- Gunakan "Kak" sebagai honorifik untuk mahasiswa
- Tawarkan follow-up setelah menjawab: "Apakah ada yang ingin ditanyakan kembali?"
- Untuk error teknis: minta detail/screenshot dulu, lalu berikan solusi langkah demi langkah
- Jika konteks tersedia dalam tag , gunakan untuk menjawab tapi PARAFRASE, bukan salin
<|im_end|>
<|im_start|>user
{query}<|im_end|>
<|im_start|>assistant
```

### Inference with Unsloth (Recommended)

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "felixhrdyn/Qwen3-8B-HPC-UG-Persona-Merged",  # Use the merged version
    max_seq_length = 1536,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)

# Your chat logic here
```

---

## Available Formats

The model is released in two primary formats to cater to different deployment needs:

### 1. Merged 16-bit (DGX/Server Ready)

Optimized for server environments with full precision weights merged for maximum reliability.

- **Model Card**: [felixhrdyn/Qwen3-8B-HPC-UG-Persona-Merged](https://huggingface.co/felixhrdyn/Qwen3-8B-HPC-UG-Persona-Merged)

### 2. GGUF (Local / Edge Ready)

Converted using **Unsloth** for lightweight deployment on local machines (macOS, Windows, Linux).
- **Model Repository**: [felixhrdyn/Qwen3-8B-HPC-UG-Persona-GGUF](https://huggingface.co/felixhrdyn/Qwen3-8B-HPC-UG-Persona-GGUF)
- **Files**: `qwen3-8b.Q8_0.gguf`

#### GGUF Usage (llama-cli)

```bash
# For text-only LLMs
llama-cli -hf felixhrdyn/Qwen3-8B-HPC-UG-Persona-GGUF --jinja

# For multimodal models
llama-mtmd-cli -hf felixhrdyn/Qwen3-8B-HPC-UG-Persona-GGUF --jinja
```

---

## Ollama Support

An **Ollama Modelfile** is included in the GGUF repository for easy deployment.

- **Efficiency**: This model was trained **2x faster** with Unsloth.
- **Deployment**: Pull or create the model using the provided Modelfile to get started in your Ollama environment.

---

## Evaluation

The model shows a significant behavioral shift from the base model, maintaining a **professional, formal, and humanistic** tone even when faced with informal or frustrated user input.

### Training Metrics

Training ran for 3 epochs, with a focus on loss convergence for behavioral stability.

| Metric | Value |
| :--- | :--- |
| Final Training Loss | 0.3802 |
| Validation Split | 10% |
| Training Epochs | 3 |
| Batch Size | 1 (Grad Accum: 4) |
| Convergence State | Loss stabilized after step 60 |

## Author

**Felix Hardyan**

- [Hugging Face](https://huggingface.co/felixhrdyn)
- [GitHub](https://github.com/flxhrdyn)
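
## Example: Assembling the ChatML Prompt

The prompt template in the Usage section can also be assembled programmatically, which helps when a conversation has more than one turn. The following is a minimal sketch, not the author's own code: the helper name `build_chatml_prompt`, the shortened `SYSTEM_PROMPT`, and the sample user message are all hypothetical, and the exact newline placement around the `<|im_start|>` and `<|im_end|>` markers is an assumption based on the common ChatML layout. When the tokenizer is available, its `apply_chat_template` method is likely the more reliable route.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(system_prompt: str, turns: List[Dict[str, str]]) -> str:
    """Render a system prompt plus chat turns as a single ChatML string.

    Each turn is a dict with "role" ("user" or "assistant") and "content".
    The result ends with an open assistant header so the model writes next.
    Hypothetical helper; whitespace layout is an assumption.
    """
    parts = [f"{IM_START}system\n{system_prompt.strip()}{IM_END}\n"]
    for turn in turns:
        role = turn["role"]
        if role not in ("user", "assistant"):
            raise ValueError(f"unsupported role: {role!r}")
        parts.append(f"{IM_START}{role}\n{turn['content'].strip()}{IM_END}\n")
    # Leave the assistant turn open for generation.
    parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


# Shortened stand-in for the full Indonesian system prompt shown under Usage.
SYSTEM_PROMPT = "Kamu adalah Asisten Praktikum AI Universitas Gunadarma."

prompt = build_chatml_prompt(
    SYSTEM_PROMPT,
    # Sample query: "Good morning, I cannot log in to the server."
    [{"role": "user", "content": "Selamat pagi, saya tidak bisa login ke server."}],
)
print(prompt)
```

The resulting string can be tokenized and passed to the model loaded in the Unsloth snippet. For multi-turn sessions, earlier assistant replies are appended to `turns` with the `"assistant"` role before the next user message.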