felixhrdyn/Qwen3-8B-HPC-UG-Persona-Merged

ModelHub XC 5a203a5093 Initialize project; model provided by the ModelHub XC community

Model: felixhrdyn/Qwen3-8B-HPC-UG-Persona-Merged
Source: Original Platform

2026-05-28 18:22:16 +08:00

Qwen 3 8B HPC UG Assistant Persona

Empathetic & Professional AI Assistant for Universitas Gunadarma HPC Lab.

Model Overview

Qwen 3 8B HPC UG Assistant Persona is a behaviorally fine-tuned version of Qwen3-8B that serves as a digital assistant for the High-Performance Computing (HPC) lab at Universitas Gunadarma.

Unlike the base model, this version is trained to adopt a humanistic persona that emphasizes empathy, professional Indonesian communication, and adherence to lab-specific protocols. It is "RAG-ready": it is designed to take context supplied by a retrieval-augmented generation (RAG) pipeline and turn it into accurate yet friendly answers.

Persona Traits

Time-Awareness: Greets users according to the time of day (morning, afternoon, or evening).
Empathy-First: Calms users during technical failures or stressful moments.
Clarification First: Asks for missing details (e.g., screenshots for errors) before providing solutions.
Natural Paraphrasing: Converts technical FAQ data into conversational, easy-to-understand language.
Survey Footer: Automatically includes feedback links only when the session is complete.
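The time-aware greeting depends on the current time, which a language model cannot observe by itself. One possible approach, sketched here, is for the calling application to work out the greeting period and pass it to the model as a hint. The function names, the hour boundaries, and the hint line itself are assumptions made for illustration; the model card does not say how the deployed application supplies the time.

```python
from datetime import datetime


def greeting_for_hour(hour: int) -> str:
    """Map an hour (0-23) to an Indonesian greeting.

    The hour boundaries are assumed for illustration. Only the greetings
    pagi/siang/sore are named in the system prompt, so late hours fall
    back to "sore" here.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    if hour < 11:
        return "Selamat pagi"
    if hour < 15:
        return "Selamat siang"
    return "Selamat sore"


def time_hint(now: datetime) -> str:
    """Build a hypothetical one-line hint to append to the system prompt."""
    greeting = greeting_for_hour(now.hour)
    return f'Waktu sekarang: {now:%H:%M}. Sapaan yang sesuai: "{greeting} Kak".'
```

A caller could append `time_hint(datetime.now())` to the system prompt before each session, so the model does not have to guess the time of day.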

Technical Specifications

This model was fine-tuned using the Unsloth library on a synthetic dataset of 126 multi-turn conversations reflecting various student emotional states.

| Parameter | Value |
| --- | --- |
| Base Model | `unsloth/qwen3-8b-unsloth-bnb-4bit` |
| Method | LoRA (PEFT) |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 16 |
| Target Modules | `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` |
| Max Seq Length | 1536 tokens |
| Epochs | 3 |
| Optimizer | `adamw_8bit` |
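To give a sense of scale for these settings, the sketch below estimates how many trainable parameters the LoRA adapters add and what scaling factor the rank and alpha imply. The layer dimensions are assumptions based on the commonly published Qwen3-8B configuration and should be checked against `config.json`. The scaling formula assumes standard LoRA (alpha / r) rather than a rank-stabilized variant.

```python
# Assumed Qwen3-8B dimensions; verify against config.json before relying on them.
HIDDEN_SIZE = 4096
INTERMEDIATE_SIZE = 12288
NUM_LAYERS = 36
NUM_ATTENTION_HEADS = 32
NUM_KV_HEADS = 8
HEAD_DIM = 128

# Values from the table above.
LORA_RANK = 16
LORA_ALPHA = 16

# (in_features, out_features) of each targeted linear layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN_SIZE, NUM_ATTENTION_HEADS * HEAD_DIM),
    "k_proj": (HIDDEN_SIZE, NUM_KV_HEADS * HEAD_DIM),
    "v_proj": (HIDDEN_SIZE, NUM_KV_HEADS * HEAD_DIM),
    "o_proj": (NUM_ATTENTION_HEADS * HEAD_DIM, HIDDEN_SIZE),
    "gate_proj": (HIDDEN_SIZE, INTERMEDIATE_SIZE),
    "up_proj": (HIDDEN_SIZE, INTERMEDIATE_SIZE),
    "down_proj": (INTERMEDIATE_SIZE, HIDDEN_SIZE),
}


def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds A (rank x in_features) and B (out_features x rank) per layer."""
    return rank * (in_features + out_features)


params_per_block = sum(
    lora_param_count(i, o, LORA_RANK) for i, o in TARGET_SHAPES.values()
)
total_lora_params = params_per_block * NUM_LAYERS
lora_scaling = LORA_ALPHA / LORA_RANK  # multiplier applied to the B @ A update

print(f"LoRA parameters per transformer block: {params_per_block:,}")
print(f"Total trainable LoRA parameters: {total_lora_params:,}")
print(f"LoRA scaling factor (alpha / r): {lora_scaling}")
```

Under these assumptions the adapters add roughly 44 million trainable parameters, a small fraction of the 8B base model, and the scaling factor of 1.0 means the low-rank update is applied without rescaling.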

Usage

Prompt Template (ChatML)

The model expects the following format for optimal persona performance:

<|im_start|>system
Kamu adalah Asisten Praktikum AI Universitas Gunadarma. Ikuti panduan gaya berikut dengan ketat:
- Gunakan sapaan sesuai waktu: "Selamat pagi/siang/sore Kak" (variasikan sesuai konteks)
- Tanya klarifikasi jika pertanyaan ambigu SEBELUM menjawab — jangan langsung dump informasi
- Parafrase informasi dari konteks FAQ — JANGAN copy-paste verbatim
- Tutup dengan footer survey HANYA jika mahasiswa menyatakan sudah selesai/cukup/tidak ada pertanyaan lagi
- Gunakan "Kak" sebagai honorifik untuk mahasiswa
- Tawarkan follow-up setelah menjawab: "Apakah ada yang ingin ditanyakan kembali?"
- Untuk error teknis: minta detail/screenshot dulu, lalu berikan solusi langkah demi langkah
- Jika konteks tersedia dalam tag <konteks>, gunakan untuk menjawab tapi PARAFRASE, bukan salin
<|im_end|>
<|im_start|>user
{query}<|im_end|>
<|im_start|>assistant
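The template above can also be assembled programmatically. The helper below is a minimal sketch: the function name is hypothetical, it follows the usual ChatML convention of placing `<|im_end|>` directly after each message, and wrapping retrieved context in `<konteks>` tags inside the user turn is an assumption (the system prompt mentions the tag but not where it should appear).

```python
from typing import Optional


def build_chatml_prompt(
    system_prompt: str, query: str, context: Optional[str] = None
) -> str:
    """Render a single-turn ChatML prompt that ends with the assistant header."""
    user_content = query.strip()
    if context:
        # Assumption: retrieved FAQ context goes before the question,
        # wrapped in the <konteks> tag named in the system prompt.
        user_content = f"<konteks>\n{context.strip()}\n</konteks>\n\n{user_content}"
    return (
        f"<|im_start|>system\n{system_prompt.strip()}<|im_end|>\n"
        f"<|im_start|>user\n{user_content}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
```

When the tokenizer is available, its bundled chat template is usually the safer way to render prompts, since it stays in sync with the repository's `chat_template.jinja`.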

Inference with Unsloth (Recommended)

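from unsloth import FastLanguageModel

# Load the merged checkpoint and its tokenizer.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "felixhrdyn/Qwen3-8B-HPC-UG-Persona-Merged", # Use the merged version
    max_seq_length = 1536,  # same sequence length as used for fine-tuning
    load_in_4bit = True,    # quantize to 4-bit at load time to reduce memory use
)
# Switch the model to Unsloth's inference mode.
FastLanguageModel.for_inference(model)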

# Your chat logic here
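One way to fill in the chat logic is sketched below. It assumes `model` and `tokenizer` behave like standard Hugging Face Transformers objects (as returned by the loading code above) and that `messages` is a list of `{"role": ..., "content": ...}` dictionaries. The function name and the default `max_new_tokens` are illustrative choices, not values from the model card.

```python
def generate_reply(model, tokenizer, messages, max_new_tokens=256):
    """Generate one assistant turn for a list of chat messages."""
    # Render the conversation with the tokenizer's chat template and append
    # the assistant header so the model starts a new turn.
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    # The template already contains the special tokens, so none are added here.
    inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
    inputs = inputs.to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Skip the prompt tokens so only the newly generated text is decoded.
    prompt_length = inputs["input_ids"].shape[-1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

A call would pass the system prompt from the template section as the first message and the student's question as the second. Keeping the prompt plus `max_new_tokens` within the 1536-token sequence length used for training is a reasonable precaution.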

Available Formats

The model is released in two primary formats to cater to different deployment needs:

1. Merged 16-bit (DGX/Server Ready)

Intended for server environments: the LoRA adapter is merged into the base model and the weights are stored in 16-bit precision, so no adapter loading is needed at inference time.

Model Card: felixhrdyn/Qwen3-8B-HPC-UG-Persona-Merged

2. GGUF (Local / Edge Ready)

Converted using Unsloth for lightweight deployment on local machines (macOS, Windows, Linux).

Model Repository: felixhrdyn/Qwen3-8B-HPC-UG-Persona-GGUF
Files: qwen3-8b.Q8_0.gguf

GGUF Usage (llama-cli)

# For text-only LLMs (this model)
llama-cli -hf felixhrdyn/Qwen3-8B-HPC-UG-Persona-GGUF --jinja

# For multimodal models (not needed here: Qwen3-8B is text-only)
llama-mtmd-cli -hf felixhrdyn/Qwen3-8B-HPC-UG-Persona-GGUF --jinja

Ollama Support

An Ollama Modelfile is included in the GGUF repository for easy deployment.

Efficiency: This model was trained 2x faster with Unsloth.
Deployment: Create the model from the provided Modelfile (for example with `ollama create <name> -f Modelfile`), then start it with `ollama run <name>`.
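Once the model is registered in Ollama, it can be queried over Ollama's local HTTP API. The sketch below builds a request for the `/api/chat` endpoint using only the standard library. The model name `hpc-ug-persona` is hypothetical (use whatever name was given to `ollama create`), and whether the bundled Modelfile already sets the system prompt is not stated in the source, so the prompt is passed explicitly here.

```python
import json
import urllib.request

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"  # Ollama's default local address


def build_chat_payload(model_name: str, system_prompt: str, query: str) -> dict:
    """Build a non-streaming chat request body for Ollama."""
    return {
        "model": model_name,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": query},
        ],
        "stream": False,  # ask for one complete JSON response
    }


def ask_ollama(payload: dict, url: str = OLLAMA_CHAT_URL) -> str:
    """Send the payload to a running Ollama server and return the reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]
```

For example, `ask_ollama(build_chat_payload("hpc-ug-persona", system_prompt, "Selamat pagi, saya tidak bisa login ke server lab"))` would return the assistant's reply as a string, provided an Ollama server is running locally.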

Evaluation

Compared with the base model, the fine-tuned model shows a clear behavioral shift: it keeps a professional, formal, and humanistic tone even when users write informally or express frustration.

Training Metrics

Training ran for 3 epochs, and the loss was monitored for convergence as an indicator of stable behavior.

| Metric | Value |
| --- | --- |
| Final Training Loss | 0.3802 |
| Validation Split | 10% |
| Training Epochs | 3 |
| Batch Size | 1 (gradient accumulation: 4) |
| Convergence State | Loss stable after step 60 |
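To put "step 60" in context, the sketch below estimates the number of optimizer steps from the figures reported in this card. It assumes one training example per conversation, a single training device, a validation set rounded up to whole conversations, and a final partial batch that still counts as a step. The exact count depends on how the trainer rounds, so the result is an estimate.

```python
import math

# Figures reported in this model card.
NUM_CONVERSATIONS = 126
VALIDATION_FRACTION = 0.10
PER_DEVICE_BATCH_SIZE = 1
GRADIENT_ACCUMULATION_STEPS = 4
EPOCHS = 3

# Assumed rounding: validation set rounded up to a whole number of conversations.
validation_size = math.ceil(NUM_CONVERSATIONS * VALIDATION_FRACTION)
train_size = NUM_CONVERSATIONS - validation_size

# Assumed single device, so the effective batch is batch size x accumulation.
effective_batch_size = PER_DEVICE_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS
steps_per_epoch = math.ceil(train_size / effective_batch_size)
total_steps = steps_per_epoch * EPOCHS

print(f"Training conversations: {train_size}, validation: {validation_size}")
print(f"Effective batch size: {effective_batch_size}")
print(f"Estimated optimizer steps: {steps_per_epoch} per epoch, {total_steps} in total")
```

Under these assumptions the run has about 87 optimizer steps, which would place step 60 early in the third epoch.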

Author

Felix Hardyan