Initialize project; model provided by the ModelHub XC community

Model: felixhrdyn/Qwen3-8B-HPC-UG-Persona-Merged Source: Original Platform
2026-05-28 18:22:16 +08:00
commit 5a203a5093
11 changed files with 916 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,143 @@
+---
+base_model: unsloth/qwen3-8b-unsloth-bnb-4bit
+tags:
+- text-generation-inference
+- transformers
+- unsloth
+- qwen3
+license: apache-2.0
+language:
+- id
+---
+
+<div align="center">
+  <h1>Qwen 3 8B HPC UG Assistant Persona</h1>
+  <p><b>Empathetic & Professional AI Assistant for the Universitas Gunadarma HPC Lab.</b></p>
+
+  [![Unsloth](https://img.shields.io/badge/Unsloth-2x_Faster-blue?style=for-the-badge&logo=unsloth)](https://github.com/unslothai/unsloth)
+  [![Hugging Face](https://img.shields.io/badge/Hugging%20Face-Models-orange?style=for-the-badge)](https://huggingface.co/felixhrdyn)
+  [![License](https://img.shields.io/badge/License-Apache%202.0-red?style=for-the-badge)](https://opensource.org/licenses/Apache-2.0)
+</div>
+
+---
+
+## Model Overview
+
+**Qwen 3 8B HPC UG Assistant Persona** is a behaviorally fine-tuned version of Qwen3-8B designed to serve as a digital assistant for the High-Performance Computing (HPC) lab at Universitas Gunadarma.
+
+Unlike the base model, this version is trained with a **humanistic persona** that emphasizes empathy, professional Indonesian communication, and adherence to specific protocols. It is "RAG-ready": it is trained to take context supplied by a retrieval-augmented generation (RAG) pipeline and turn it into accurate yet friendly answers.
+
+## Persona Traits
+- **Time-Awareness**: Greets users appropriately (Morning/Afternoon/Evening).
+- **Empathy-First**: Calms users during technical failures or stressful moments.
+- **Clarification First**: Asks for missing details (e.g., screenshots for errors) before providing solutions.
+- **Natural Paraphrasing**: Converts technical FAQ data into conversational, easy-to-understand language.
+- **Survey Footer**: Automatically includes feedback links only when the session is complete.
+
+---
+
+## Technical Specifications
+
+This model was fine-tuned using the **Unsloth** library on a synthetic dataset of 126 multi-turn conversations reflecting various student emotional states.
+
+| Parameter | Value |
+| :--- | :--- |
+| **Base Model** | `unsloth/qwen3-8b-unsloth-bnb-4bit` |
+| **Method** | LoRA (PEFT) |
+| **LoRA Rank (r)** | 16 |
+| **LoRA Alpha** | 16 |
+| **Target Modules** | `q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj` |
+| **Max Seq Length** | 1536 tokens |
+| **Epochs** | 3 |
+| **Optimizer** | `adamw_8bit` |
+
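As a rough check on what the LoRA settings in the table imply, the sketch below estimates the adapter scaling factor and the number of trainable parameters. The layer dimensions are assumed Qwen3-8B values that this README does not state (verify them against the model's `config.json`), and the scaling is assumed to be the standard `alpha / r` rather than a rank-stabilized variant.

```python
# Assumed Qwen3-8B dimensions (not stated in this README; verify against config.json).
HIDDEN = 4096
INTERMEDIATE = 12288
NUM_LAYERS = 36
Q_OUT = 32 * 128   # attention heads * head_dim
KV_OUT = 8 * 128   # key/value heads * head_dim

R, ALPHA = 16, 16  # LoRA rank and alpha from the table above

# (in_features, out_features) for each target module listed in the table
shapes = {
    "q_proj": (HIDDEN, Q_OUT),
    "k_proj": (HIDDEN, KV_OUT),
    "v_proj": (HIDDEN, KV_OUT),
    "o_proj": (Q_OUT, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(shapes, r, num_layers):
    """Count adapter weights: each module adds A (r x in) and B (out x r)."""
    per_layer = sum(r * (fan_in + fan_out) for fan_in, fan_out in shapes.values())
    return per_layer * num_layers


scaling = ALPHA / R
total = lora_params(shapes, R, NUM_LAYERS)
print(f"scaling = {scaling}, trainable LoRA parameters = {total:,}")
```

Under these assumptions the adapters add about 43.6 million trainable parameters, a small fraction of the 8B base weights, and a scaling of 1.0 means the adapter update is applied without extra amplification.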
+---
+
+## Usage
+
+### Prompt Template (ChatML)
+The model expects the following format for optimal persona performance:
+
+```
+<|im_start|>system
+Kamu adalah Asisten Praktikum AI Universitas Gunadarma. Ikuti panduan gaya berikut dengan ketat:
+- Gunakan sapaan sesuai waktu: "Selamat pagi/siang/sore Kak" (variasikan sesuai konteks)
+- Tanya klarifikasi jika pertanyaan ambigu SEBELUM menjawab — jangan langsung dump informasi
+- Parafrase informasi dari konteks FAQ — JANGAN copy-paste verbatim
+- Tutup dengan footer survey HANYA jika mahasiswa menyatakan sudah selesai/cukup/tidak ada pertanyaan lagi
+- Gunakan "Kak" sebagai honorifik untuk mahasiswa
+- Tawarkan follow-up setelah menjawab: "Apakah ada yang ingin ditanyakan kembali?"
+- Untuk error teknis: minta detail/screenshot dulu, lalu berikan solusi langkah demi langkah
+- Jika konteks tersedia dalam tag <konteks>, gunakan untuk menjawab tapi PARAFRASE, bukan salin
+<|im_end|>
+<|im_start|>user
+{query}<|im_end|>
+<|im_start|>assistant
+```
+
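When the prompt is built by hand rather than through the tokenizer's chat template, the layout above can be reproduced with a small helper. This is a minimal sketch: `build_prompt` and `SYSTEM_PROMPT` are hypothetical names, the system prompt is shortened to its first sentence, and placing the retrieved context inside `<konteks>...</konteks>` at the start of the user turn is an assumption, since the README only says the context arrives in a `<konteks>` tag.

```python
# Shortened for illustration; use the full system prompt from the template above.
SYSTEM_PROMPT = "Kamu adalah Asisten Praktikum AI Universitas Gunadarma."


def build_prompt(query, system_prompt=SYSTEM_PROMPT, context=None):
    """Render a single user turn in the ChatML layout shown above."""
    # Assumption: retrieved FAQ text is wrapped in <konteks> tags and placed
    # before the question in the user turn; the README does not say where it goes.
    if context:
        query = f"<konteks>\n{context}\n</konteks>\n{query}"
    return (
        f"<|im_start|>system\n{system_prompt}\n<|im_end|>\n"
        f"<|im_start|>user\n{query}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )


# Hypothetical student question, for illustration only.
print(build_prompt("Bagaimana cara mengakses server HPC?"))
```

The returned string ends with the open assistant turn, so generation continues from there.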
+### Inference with Unsloth (Recommended)
+```python
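+from unsloth import FastLanguageModel
+
+model, tokenizer = FastLanguageModel.from_pretrained(
+    model_name = "felixhrdyn/Qwen3-8B-HPC-UG-Persona-Merged", # Use the merged version
+    max_seq_length = 1536,
+    load_in_4bit = True,
+)
+FastLanguageModel.for_inference(model) # Switch the model to inference mode
+
+# Minimal single-turn example; paste the full system prompt from the template above.
+system_prompt = "Kamu adalah Asisten Praktikum AI Universitas Gunadarma."
+messages = [{"role": "system", "content": system_prompt},
+            {"role": "user", "content": "Selamat pagi, saya tidak bisa login."}] # example query
+text = tokenizer.apply_chat_template(messages, tokenize = False, add_generation_prompt = True)
+inputs = tokenizer(text, return_tensors = "pt").to(model.device)
+outputs = model.generate(**inputs, max_new_tokens = 256)
+# Decode only the newly generated tokens
+print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens = True))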
+```
+
+---
+
+## Available Formats
+
+The model is released in two primary formats to cater to different deployment needs:
+
+### 1. Merged 16-bit (DGX/Server Ready)
+Intended for server environments: the LoRA weights are merged into the base model and stored in 16-bit precision.
+- **Model Card**: [felixhrdyn/Qwen3-8B-HPC-UG-Persona-Merged](https://huggingface.co/felixhrdyn/Qwen3-8B-HPC-UG-Persona-Merged)
+
+### 2. GGUF (Local / Edge Ready)
+Converted using **Unsloth** for lightweight deployment on local machines (macOS, Windows, Linux).
+- **Model Repository**: [felixhrdyn/Qwen3-8B-HPC-UG-Persona-GGUF](https://huggingface.co/felixhrdyn/Qwen3-8B-HPC-UG-Persona-GGUF)
+- **Files**: `qwen3-8b.Q8_0.gguf`
+
+#### GGUF Usage (llama-cli)
+```bash
+# For text only LLMs
+llama-cli -hf felixhrdyn/Qwen3-8B-HPC-UG-Persona-GGUF --jinja
+
+# For multimodal models (not needed here: Qwen3-8B is text-only)
+llama-mtmd-cli -hf felixhrdyn/Qwen3-8B-HPC-UG-Persona-GGUF --jinja
+```
+
+---
+
+## Ollama Support
+
+An **Ollama Modelfile** is included in the GGUF repository for easy deployment. 
+- **Efficiency**: This model was trained **2x faster** with Unsloth.
+- **Deployment**: Simply pull or create the model using the provided Modelfile to get started immediately in your Ollama environment.
+
+---
+
+## Evaluation
+The model shows a significant behavioral shift from the base model, maintaining a **Professional, Formal, and Humanistic** tone even when faced with informal or frustrated user inputs.
+
+### Training Metrics
+Training ran for 3 epochs and was monitored for loss convergence as an indicator of stable behavior.
+
+| Metric | Value |
+| :--- | :--- |
+| Final Training Loss | 0.3802 |
+| Validation Split | 10% |
+| Training Epochs | 3 |
+| Batch Size | 1 (Grad Accum: 4) |
+| Convergence State | Loss stabilized after step 60 |
+
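To put the step count in the table into context, the training schedule can be estimated from the reported dataset size, validation split, batch size, and gradient accumulation. This is a back-of-the-envelope sketch: it assumes the split rounds to the nearest whole example and that training ran on a single device, neither of which the README states, and trainers differ in whether they round a partial accumulation window up or down.

```python
import math

num_conversations = 126   # dataset size from Technical Specifications
val_fraction = 0.10       # validation split from the table above
per_device_batch = 1
grad_accum = 4
epochs = 3

# Assumption: the validation split rounds to the nearest whole example
# and training uses one device; the README does not state either.
train_size = num_conversations - round(num_conversations * val_fraction)
effective_batch = per_device_batch * grad_accum
steps_per_epoch = math.ceil(train_size / effective_batch)
total_steps = steps_per_epoch * epochs

print(effective_batch, train_size, steps_per_epoch, total_steps)
```

Under these assumptions the run has an effective batch size of 4 and roughly 87 optimizer steps in total, which would place step 60 near the start of the third epoch.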
+## Author
+**Felix Hardyan**
+- [Hugging Face](https://huggingface.co/felixhrdyn)
+- [GitHub](https://github.com/flxhrdyn)