---
language:
- ar
license: apache-2.0
library_name: transformers
base_model: Qwen/Qwen3-4B
datasets:
- NightPrince/islamic-arabic-qa
tags:
- arabic
- islamic
- fiqh
- fatwa
- qlora
- peft
- qwen3
- instruction-tuning
- conversational
pipeline_tag: text-generation
inference:
  parameters:
    max_new_tokens: 512
    temperature: 0.3
    do_sample: true
widget:
- text: "ما حكم زكاة الفطر وما مقدارها؟"
  example_title: "زكاة الفطر"
- text: "ما الفرق بين الفرض والواجب عند الحنفية؟"
  example_title: "الفرض والواجب"
- text: "ما حكم بيع العينة في الفقه الإسلامي؟"
  example_title: "بيع العينة"
- text: "ما شروط صحة الصلاة؟"
  example_title: "شروط الصلاة"
model-index:
- name: Qwen3-4B-Islamic-Arabic
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Islamic Arabic Q&A
      type: NightPrince/islamic-arabic-qa
      split: validation
    metrics:
    - name: Validation Loss
      type: loss
      value: 2.4094
      verified: false
---

# Qwen3-4B-Islamic-Arabic

**Qwen3-4B fine-tuned on Islamic Arabic Q&A via QLoRA; merged FP16, ready for direct inference.**

This is the canonical, fully merged version of a Qwen3-4B model fine-tuned on 17,944 quality-filtered Islamic Arabic question-answer pairs spanning Fiqh, Fatwa, Aqeedah, Quran Sciences, and Islamic Finance. The LoRA adapter has been merged into the base weights and saved in FP16; no additional adapter loading is required.

Trained by **[Yahya Alnwsany (NightPrince)](https://huggingface.co/NightPrince)**, 2026-05-05.

---

## Model Variants

| Variant | Repo | Description |
|---|---|---|
| **Merged FP16** (this model) | [NightPrince/Qwen3-4B-Islamic-Arabic](https://huggingface.co/NightPrince/Qwen3-4B-Islamic-Arabic) | Canonical merged model, FP16, ~7.6 GB; drop-in for transformers or vLLM |
| **LoRA Adapter** | [NightPrince/Qwen3-4B-Islamic-Arabic-LoRA](https://huggingface.co/NightPrince/Qwen3-4B-Islamic-Arabic-LoRA) | PEFT adapter only, 264 MB; apply on top of `Qwen/Qwen3-4B` |
| **INT4 Quantized** | [NightPrince/Qwen3-4B-Islamic-Arabic-INT4](https://huggingface.co/NightPrince/Qwen3-4B-Islamic-Arabic-INT4) | W4A16 compressed-tensors for fast vLLM serving, 2.5 GB |
| **MLX 4-bit** | [NightPrince/Qwen3-4B-Islamic-Arabic-mlx-4Bit](https://huggingface.co/NightPrince/Qwen3-4B-Islamic-Arabic-mlx-4Bit) | Apple Silicon / MLX: native Mac inference, 4-bit quantized |
| **GGUF** | [NightPrince/Qwen3-4B-Islamic-Arabic-GGUF](https://huggingface.co/NightPrince/Qwen3-4B-Islamic-Arabic-GGUF) | llama.cpp / Ollama / LM Studio: Q4_K_M (2.3 GB), Q8_0 (4.0 GB), F16 (7.5 GB) |
| **Dataset** | [NightPrince/islamic-arabic-qa](https://huggingface.co/datasets/NightPrince/islamic-arabic-qa) | 17,944 train / 2,101 val / 1,042 test Islamic Arabic Q&A pairs |

---

## Training Metrics

### Loss Curve

| Checkpoint | Train Loss | Eval Loss |
|---|---|---|
| Step 0 (init) | n/a | n/a |
| Step 843 (final) | **1.8918** | **2.4094** (best) |

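For readers who prefer perplexity, the losses in the table above can be converted with `exp(loss)`. This is a minimal sketch that assumes the reported values are mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card; the function name `perplexity` is illustrative.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# Loss values copied from the loss table.
train_ppl = perplexity(1.8918)
eval_ppl = perplexity(2.4094)

print(f"train perplexity: {train_ppl:.2f}")
print(f"eval perplexity:  {eval_ppl:.2f}")
```

Perplexity and token accuracy are separate metrics, so neither can be derived from the other.
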
### Token Accuracy

| Phase | Token Accuracy |
|---|---|
| Early training | ~50% |
| End of training | ~60% |

> **MCQ evaluation coming soon.** A multiple-choice benchmark (Islamic domain) is prepared but requires serving the model via vLLM. Results will be posted here once available.

---

## Usage

### Transformers Inference

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "NightPrince/Qwen3-4B-Islamic-Arabic"

# Load the merged FP16 weights; device_map="auto" places them on the available device(s).
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# System prompt (Arabic). In English: "You are a specialized Islamic scholar assistant.
# Answer questions accurately based on the Holy Quran, the Prophetic Sunnah, and classical
# Islamic jurisprudence. Cite sources wherever possible. Be concise but comprehensive."
SYSTEM_PROMPT = (
    "أنت مساعد عالم إسلامي متخصص. "
    "أجب على الأسئلة بدقة استناداً إلى القرآن الكريم والسنة النبوية والفقه الإسلامي الكلاسيكي. "
    "استشهد بالمصادر حيثما أمكن. كن موجزاً لكن شاملاً."
)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    # User question: "What is the ruling on zakat on saved money?"
    {"role": "user", "content": "ما حكم الزكاة على المال المدخر؟"},
]

# Render the conversation with the model's chat template and append the assistant prefix.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
    )

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```

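If you start from the LoRA adapter variant rather than the merged weights, the adapter can be applied to the base model with PEFT. The following is a minimal sketch, not the author's script: it assumes the adapter repository listed in the variants table loads with `PeftModel.from_pretrained`, and that the tokenizer from this repository is the right one to pair with it. `load_adapter_model` and the three constants are illustrative names.

```python
BASE_ID = "Qwen/Qwen3-4B"
ADAPTER_ID = "NightPrince/Qwen3-4B-Islamic-Arabic-LoRA"
# Assumption: the adapter uses the same tokenizer as the merged model.
TOKENIZER_ID = "NightPrince/Qwen3-4B-Islamic-Arabic"


def load_adapter_model(base_id: str = BASE_ID, adapter_id: str = ADAPTER_ID, merge: bool = True):
    """Load the base model, attach the LoRA adapter, and optionally merge it."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)
    base = AutoModelForCausalLM.from_pretrained(
        base_id,
        torch_dtype=torch.float16,
        device_map="auto",
    )
    model = PeftModel.from_pretrained(base, adapter_id)
    if merge:
        # Fold the adapter weights into the base layers for plain inference.
        model = model.merge_and_unload()
    return tokenizer, model
```

Calling `tokenizer, model = load_adapter_model()` downloads both repositories; the returned objects can then be used exactly like `tokenizer` and `model` in the Transformers example.
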
### vLLM Serving

The merged FP16 model is ~7.6 GB, which leaves little headroom for the KV cache and activations on a single 11 GB GPU. On 11 GB GPUs (e.g., RTX 2080 Ti), use at least `tensor_parallel_size=2`; otherwise use a single 24 GB+ GPU.

```bash
# Install vLLM if needed
pip install vllm

# Serve with tensor parallelism across 2 GPUs
vllm serve NightPrince/Qwen3-4B-Islamic-Arabic \
  --dtype float16 \
  --tensor-parallel-size 2 \
  --max-model-len 4096 \
  --port 8000
```

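To get a feel for the memory needed beyond the ~7.6 GB of weights, the KV cache can be estimated by hand. The sketch below is a back-of-the-envelope estimate, not a vLLM calculation: the layer count, KV-head count, and head dimension are assumptions taken from the published Qwen3-4B configuration (36 layers, 8 KV heads, head dimension 128), and `kv_cache_bytes` is an illustrative helper. It ignores activations, CUDA graphs, and allocator overhead.

```python
def kv_cache_bytes(num_tokens: int, num_layers: int = 36, num_kv_heads: int = 8,
                   head_dim: int = 128, bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for num_tokens tokens (FP16 by default)."""
    # The factor 2 covers the separate key and value tensors in every layer.
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value * num_tokens


GIB = 1024 ** 3
per_token = kv_cache_bytes(1)
one_sequence = kv_cache_bytes(4096)  # one sequence at --max-model-len 4096

print(f"KV cache per token: {per_token / 1024:.0f} KiB")
print(f"KV cache for one 4096-token sequence: {one_sequence / GIB:.2f} GiB")
```
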
Query the running server:

```python
from openai import OpenAI

# vLLM exposes an OpenAI-compatible API. The key is a placeholder unless the
# server was started with --api-key.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="token-abc123")

# Same Arabic system prompt as in the Transformers example.
SYSTEM_PROMPT = (
    "أنت مساعد عالم إسلامي متخصص. "
    "أجب على الأسئلة بدقة استناداً إلى القرآن الكريم والسنة النبوية والفقه الإسلامي الكلاسيكي. "
    "استشهد بالمصادر حيثما أمكن. كن موجزاً لكن شاملاً."
)

response = client.chat.completions.create(
    model="NightPrince/Qwen3-4B-Islamic-Arabic",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        # User question: "What is the ruling on zakat on saved money?"
        {"role": "user", "content": "ما حكم الزكاة على المال المدخر؟"},
    ],
    max_tokens=512,
    temperature=0.7,
)
print(response.choices[0].message.content)
```

> **Prefer lower memory?** Use the [INT4 quantized variant](https://huggingface.co/NightPrince/Qwen3-4B-Islamic-Arabic-INT4) (2.5 GB) for vLLM or the [GGUF variant](https://huggingface.co/NightPrince/Qwen3-4B-Islamic-Arabic-GGUF) for llama.cpp / Ollama.

---

## Training Details

### Dataset

| Property | Value |
|---|---|
| Dataset | [NightPrince/islamic-arabic-qa](https://huggingface.co/datasets/NightPrince/islamic-arabic-qa) |
| Train split | 17,944 samples |
| Validation split | 2,101 samples |
| Test split | 1,042 samples |
| Language | Arabic (Modern Standard + Classical) |
| Domains | Fiqh, Fatwa, Aqeedah, Quran Sciences, Islamic Finance |
| Quality filter | Applied: deduplication, length filtering, domain relevance scoring |
| Format | Instruction-following (system / user / assistant) |

### Hyperparameters

| Hyperparameter | Value |
|---|---|
| Epochs | 3 |
| Per-device batch size | 1 |
| Gradient accumulation steps | 16 |
| Effective batch size | 64 (1 per device × 16 accumulation steps × 4 GPUs) |
| Learning rate | 2e-4 |
| LR scheduler | Cosine with warmup |
| Warmup ratio | 0.05 |
| Max sequence length | 1,024 tokens |
| Optimizer | AdamW (paged, 8-bit) |
| Precision | QLoRA (4-bit base + BF16 adapters) |
| Gradient checkpointing | Enabled |
| Loss masking | Assistant turns only (`assistant_only_loss=True`) |

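The numbers in this table fit together with the 843 total steps reported under Results. The sketch below shows the arithmetic under stated assumptions: the 4 GPUs listed under Hardware each process one sample per micro-step, the last partial batch of an epoch still counts as a step, and warmup steps are rounded up. The cosine formula follows the common warmup-then-cosine shape and is not copied from the training script; all names are illustrative.

```python
import math

TRAIN_SAMPLES = 17_944
PER_DEVICE_BATCH = 1
GRAD_ACCUM_STEPS = 16
NUM_GPUS = 4  # from the Hardware section
EPOCHS = 3
PEAK_LR = 2e-4
WARMUP_RATIO = 0.05

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_GPUS
steps_per_epoch = math.ceil(TRAIN_SAMPLES / effective_batch)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)  # assumption: rounded up


def learning_rate(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay to zero at total_steps."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch, steps_per_epoch, total_steps, warmup_steps)
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, f"{learning_rate(step):.2e}")
```

Under these assumptions the effective batch size is 64 and the total step count is 843, matching the reported values.
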
### LoRA Configuration

| Parameter | Value |
|---|---|
| Rank (r) | 64 |
| Alpha (α) | 128 |
| Dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Trainable parameters | 132,120,576 |
| % of total parameters | 5.65% as reported; 132,120,576 / 4.15B ≈ 3.18% |

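The trainable-parameter count can be reproduced from the LoRA rank and the shapes of the targeted projections. This is a sketch under assumptions: the layer shapes below are taken from the published Qwen3-4B configuration (hidden size 2560, 36 layers, 32 query heads, 8 KV heads, head dimension 128, MLP width 9728) and are not stated in this card. Each adapted linear layer with `in` input and `out` output features adds `r × (in + out)` parameters.

```python
RANK = 64
NUM_LAYERS = 36
HIDDEN = 2560
Q_DIM = 32 * 128   # query heads × head dimension
KV_DIM = 8 * 128   # KV heads × head dimension
MLP_DIM = 9728

# (input features, output features) for each targeted projection in one layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, Q_DIM),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (Q_DIM, HIDDEN),
    "gate_proj": (HIDDEN, MLP_DIM),
    "up_proj": (HIDDEN, MLP_DIM),
    "down_proj": (MLP_DIM, HIDDEN),
}


def lora_params(in_features: int, out_features: int, rank: int = RANK) -> int:
    """LoRA adds A (rank × in) and B (out × rank) to each adapted layer."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o) for i, o in TARGET_SHAPES.values())
trainable = per_layer * NUM_LAYERS

print(f"{trainable:,} trainable parameters")
print(f"share of a 4.15B total: {trainable / 4.15e9:.2%}")
```

Under these assumed shapes the count comes out to the 132,120,576 reported in the table.
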
### Results

| Metric | Value |
|---|---|
| Final train loss | 1.8918 |
| Best eval loss | 2.4094 |
| Total training steps | 843 |
| Training duration | 7.59 hours |
| Token accuracy (start → end) | ~50% → ~60% |
| MCQ benchmark | Coming soon (requires vLLM serving) |

### Hardware

| Component | Specification |
|---|---|
| GPUs | 4× NVIDIA GeForce RTX 2080 Ti (11 GB VRAM each, 44 GB total) |
| CUDA version | 13.0 |
| Training framework | DDP via Hugging Face Accelerate |

### Software Environment

| Library | Version |
|---|---|
| Python | 3.11.15 |
| PyTorch | 2.11.0+cu130 |
| Transformers | 4.57.6 |
| PEFT | 0.18.1 |
| TRL | 1.3.0 |
| BitsAndBytes | 0.49.2 |
| Accelerate | 1.13.0 |

---

## Limitations

- **Domain scope**: The model is optimized for Islamic Arabic Q&A. General Arabic tasks or non-Islamic domains may show degraded quality compared to the base Qwen3-4B.
- **Source attribution**: While the model is trained to cite sources, citations should be independently verified, because the model can produce plausible-sounding but incorrect references.
- **Classical vs. contemporary Fiqh**: The training data emphasizes classical scholarship. Contemporary jurisprudential debates, especially minority or regional opinions, may be underrepresented.
- **Language**: The model performs best in Arabic (Modern Standard and Classical). Responses in other languages are not guaranteed to be accurate or fluent.

---

## Citation

```bibtex
@misc{alnwsany2026qwen3islamicarbic,
  author       = {Yahya Alnwsany},
  title        = {Qwen3-4B-Islamic-Arabic: QLoRA Fine-Tuning of Qwen3-4B on Islamic Arabic Q\&A},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/NightPrince/Qwen3-4B-Islamic-Arabic}},
  note         = {Base model: Qwen/Qwen3-4B. Dataset: NightPrince/islamic-arabic-qa.}
}
```

---

## License

This model is released under the **Apache 2.0** license, consistent with the base model [Qwen/Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B). See [LICENSE](https://www.apache.org/licenses/LICENSE-2.0) for details.