
---
language:
- ar
license: apache-2.0
library_name: transformers
base_model: Qwen/Qwen3-4B
datasets:
- NightPrince/islamic-arabic-qa
tags:
- arabic
- islamic
- fiqh
- fatwa
- qlora
- peft
- qwen3
- instruction-tuning
- conversational
pipeline_tag: text-generation
inference:
  parameters:
    max_new_tokens: 512
    temperature: 0.3
    do_sample: true
widget:
# "What is the ruling on Zakat al-Fitr, and how much is it?" (title: Zakat al-Fitr)
- text: "ما حكم زكاة الفطر وما مقدارها؟"
  example_title: "زكاة الفطر"
# "What is the difference between fard and wajib according to the Hanafis?" (title: Fard and wajib)
- text: "ما الفرق بين الفرض والواجب عند الحنفية؟"
  example_title: "الفرض والواجب"
# "What is the ruling on bay' al-'inah in Islamic jurisprudence?" (title: Bay' al-'inah)
- text: "ما حكم بيع العينة في الفقه الإسلامي؟"
  example_title: "بيع العينة"
# "What are the conditions for the validity of prayer?" (title: Conditions of prayer)
- text: "ما شروط صحة الصلاة؟"
  example_title: "شروط الصلاة"
model-index:
- name: Qwen3-4B-Islamic-Arabic
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Islamic Arabic Q&A
      type: NightPrince/islamic-arabic-qa
      split: validation
    metrics:
    - name: Validation Loss
      type: loss
      value: 2.4094
      verified: false
---

Qwen3-4B-Islamic-Arabic

Qwen3-4B fine-tuned on Islamic Arabic Q&A via QLoRA — merged FP16, ready for direct inference.

This is the canonical, fully merged version of a Qwen3-4B model fine-tuned on 17,944 high-quality Islamic Arabic question-answer pairs spanning Fiqh, Fatwa, Aqeedah, Quran Sciences, and Islamic Finance. The LoRA adapter has been merged into the base weights and saved in FP16; no additional adapter loading is required.

Trained by Yahya Alnwsany (NightPrince) — 2026-05-05.


Model Variants

| Variant | Repo | Description |
| --- | --- | --- |
| Merged FP16 (this model) | NightPrince/Qwen3-4B-Islamic-Arabic | Canonical merged model, FP16, ~7.6 GB; drop-in for transformers or vLLM |
| LoRA Adapter | NightPrince/Qwen3-4B-Islamic-Arabic-LoRA | PEFT adapter only, 264 MB; apply on top of Qwen/Qwen3-4B |
| INT4 Quantized | NightPrince/Qwen3-4B-Islamic-Arabic-INT4 | W4A16 compressed-tensors for fast vLLM serving, 2.5 GB |
| MLX 4-bit | NightPrince/Qwen3-4B-Islamic-Arabic-mlx-4Bit | Apple Silicon / MLX; native Mac inference, 4-bit quantized |
| GGUF | NightPrince/Qwen3-4B-Islamic-Arabic-GGUF | llama.cpp / Ollama / LM Studio: Q4_K_M (2.3 GB), Q8_0 (4.0 GB), F16 (7.5 GB) |
| Dataset | NightPrince/islamic-arabic-qa | 17,944 train / 2,101 val / 1,042 test; Islamic Arabic Q&A pairs |

Training Metrics

Loss Curve

| Checkpoint | Train Loss | Eval Loss |
| --- | --- | --- |
| Step 0 (init) | | |
| Step 843 (final) | 1.8918 | 2.4094 (best) |

Token Accuracy

| Phase | Token Accuracy |
| --- | --- |
| Early training | ~50% |
| End of training | ~60% |
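
The card does not define how token accuracy is computed. A minimal sketch, assuming it is the share of unmasked target tokens whose predicted token matches the label (in the spirit of the mean token accuracy that TRL's SFT trainer logs), is shown here. The function name `token_accuracy` and the toy token ids are hypothetical; the `-100` mask value is the usual convention for excluding prompt tokens from the loss.

```python
IGNORE_INDEX = -100  # conventional label value for positions excluded from the loss


def token_accuracy(predicted_ids, label_ids, ignore_index=IGNORE_INDEX):
    """Fraction of unmasked positions where the predicted token equals the label."""
    correct = 0
    counted = 0
    for pred, label in zip(predicted_ids, label_ids):
        if label == ignore_index:
            continue  # masked position (e.g. system/user tokens), not scored
        counted += 1
        correct += int(pred == label)
    return correct / counted if counted else 0.0


# Toy sequence with made-up token ids: the first three positions are prompt tokens.
labels = [-100, -100, -100, 11, 12, 13, 14, 15]
preds = [7, 8, 9, 11, 12, 99, 14, 98]

# 3 of the 5 scored (assistant) tokens match.
print(token_accuracy(preds, labels))
```

Under this reading, ~60% at the end of training would mean that roughly three in five assistant tokens are predicted exactly when the model is given the reference prefix.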

MCQ evaluation coming soon: a multiple-choice benchmark for the Islamic domain has been prepared, but running it requires serving the model via vLLM. Results will be posted here once available.


Usage

Transformers Inference

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "NightPrince/Qwen3-4B-Islamic-Arabic"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

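# System prompt, in English: "You are a specialized Islamic scholar assistant.
# Answer questions accurately based on the Holy Quran, the Prophetic Sunnah, and
# classical Islamic jurisprudence. Cite sources where possible. Be concise but comprehensive."
SYSTEM_PROMPT = (
    "أنت مساعد عالم إسلامي متخصص. "
    "أجب على الأسئلة بدقة استناداً إلى القرآن الكريم والسنة النبوية والفقه الإسلامي الكلاسيكي. "
    "استشهد بالمصادر حيثما أمكن. كن موجزاً لكن شاملاً."
)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    # User question, in English: "What is the ruling on zakat on saved money?"
    {"role": "user", "content": "ما حكم الزكاة على المال المدخر؟"},
]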

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
    )

response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)

vLLM Serving

The merged FP16 model is ~7.6 GB. Use at least tensor_parallel_size=2 on 11 GB GPUs (e.g., RTX 2080 Ti), or a single 24 GB+ GPU.

# Install vLLM if needed
pip install vllm

# Serve with tensor parallelism across 2 GPUs
vllm serve NightPrince/Qwen3-4B-Islamic-Arabic \
    --dtype float16 \
    --tensor-parallel-size 2 \
    --max-model-len 4096 \
    --port 8000

Query the running server:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="token-abc123")

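# Arabic system prompt: instructs the model to act as a specialized Islamic scholar
# assistant, answer from the Quran, the Sunnah, and classical fiqh, cite sources, and stay concise.
SYSTEM_PROMPT = (
    "أنت مساعد عالم إسلامي متخصص. "
    "أجب على الأسئلة بدقة استناداً إلى القرآن الكريم والسنة النبوية والفقه الإسلامي الكلاسيكي. "
    "استشهد بالمصادر حيثما أمكن. كن موجزاً لكن شاملاً."
)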

response = client.chat.completions.create(
    model="NightPrince/Qwen3-4B-Islamic-Arabic",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "ما حكم الزكاة على المال المدخر؟"},
    ],
    max_tokens=512,
    temperature=0.7,
)
print(response.choices[0].message.content)

Prefer lower memory? Use the INT4 quantized variant (2.5 GB) for vLLM or the GGUF variant for llama.cpp / Ollama.


Training Details

Dataset

| Property | Value |
| --- | --- |
| Dataset | NightPrince/islamic-arabic-qa |
| Train split | 17,944 samples |
| Validation split | 2,101 samples |
| Test split | 1,042 samples |
| Language | Arabic (Modern Standard + Classical) |
| Domains | Fiqh, Fatwa, Aqeedah, Quran Sciences, Islamic Finance |
| Quality filter | Applied: deduplication, length filtering, domain relevance scoring |
| Format | Instruction-following (system / user / assistant) |

Hyperparameters

| Hyperparameter | Value |
| --- | --- |
| Epochs | 3 |
| Per-device batch size | 1 |
| Gradient accumulation steps | 16 |
| Effective batch size | 64 |
| Learning rate | 2e-4 |
| LR scheduler | Cosine with warmup |
| Warmup ratio | 0.05 |
| Max sequence length | 1,024 tokens |
| Optimizer | AdamW (paged, 8-bit) |
| Precision | QLoRA (4-bit base + BF16 adapters) |
| Gradient checkpointing | Enabled |
| Loss masking | Assistant turns only (assistant_only_loss=True) |
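
The effective batch size and the 843 total steps follow from the other values on this card, which the sketch below works through. The GPU count comes from the Hardware section and the sample count from the Dataset section. The rounding of the warmup step count and the exact schedule shape (linear warmup, then cosine decay to zero) are assumptions; the training script is not shown in this card, and `lr_at` is a hypothetical helper name.

```python
import math

# Values taken from the tables in this card.
train_samples = 17_944
num_gpus = 4
per_device_batch = 1
grad_accum_steps = 16
epochs = 3
peak_lr = 2e-4
warmup_ratio = 0.05

# One optimizer step consumes this many samples across all GPUs.
effective_batch = per_device_batch * grad_accum_steps * num_gpus
steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)  # rounding up is an assumption


def lr_at(step):
    """Learning rate at an optimizer step: linear warmup, then cosine decay (assumed shape)."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch, steps_per_epoch, total_steps, warmup_steps)
print(lr_at(0), lr_at(warmup_steps), lr_at(total_steps // 2))
```

With these inputs the sketch gives an effective batch of 64 and 281 steps per epoch, hence 843 steps over 3 epochs, which agrees with the step count reported under Results.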

LoRA Configuration

| Parameter | Value |
| --- | --- |
| Rank (r) | 64 |
| Alpha (α) | 128 |
| Dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Trainable parameters | 132,120,576 |
| % of total parameters | 5.65% of 4.15B |
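
The trainable-parameter count can be reproduced from the LoRA rank and the shapes of the targeted projections: each adapted linear layer of shape (d_in, d_out) adds r × (d_in + d_out) parameters. The sketch below assumes the layer dimensions of the published Qwen/Qwen3-4B configuration (hidden size 2560, MLP size 9728, 36 layers, 32 query heads and 8 key/value heads of dimension 128); these numbers are not stated in this card. Dividing the result by 4.15B gives about 3.2%, so the 5.65% in the table is presumably measured against a different total, for example the smaller count reported when the base weights are packed in 4-bit; that explanation is a guess.

```python
# Assumed Qwen3-4B dimensions (from the base model's published config, not from this card).
hidden = 2560
intermediate = 9728
num_layers = 36
q_out = 32 * 128   # query heads * head_dim
kv_out = 8 * 128   # key/value heads * head_dim
rank = 64          # LoRA rank from the table above

# (input features, output features) of each targeted linear layer.
target_shapes = {
    "q_proj": (hidden, q_out),
    "k_proj": (hidden, kv_out),
    "v_proj": (hidden, kv_out),
    "o_proj": (q_out, hidden),
    "gate_proj": (hidden, intermediate),
    "up_proj": (hidden, intermediate),
    "down_proj": (intermediate, hidden),
}


def lora_params(d_in, d_out, r):
    """LoRA adds A (r x d_in) and B (d_out x r) for each adapted layer."""
    return r * (d_in + d_out)


per_layer = sum(lora_params(d_in, d_out, rank) for d_in, d_out in target_shapes.values())
total = per_layer * num_layers

print(f"{total:,}")            # 132,120,576
print(f"{total / 4.15e9:.2%}")  # share relative to 4.15B parameters
```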

Results

| Metric | Value |
| --- | --- |
| Final train loss | 1.8918 |
| Best eval loss | 2.4094 |
| Total training steps | 843 |
| Training duration | 7.59 hours |
| Token accuracy (start → end) | ~50% → ~60% |
| MCQ benchmark | Coming soon (requires vLLM serving) |
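
For readers who prefer perplexity or throughput, the figures above convert directly. This is a small sketch that assumes the reported losses are mean cross-entropy in nats per scored token, which is the usual convention but is not stated in the card.

```python
import math

# Values from the Results table.
final_train_loss = 1.8918
best_eval_loss = 2.4094
total_steps = 843
training_hours = 7.59

# Perplexity is exp(loss) when the loss is mean cross-entropy in nats (assumption).
train_ppl = math.exp(final_train_loss)
eval_ppl = math.exp(best_eval_loss)

# Average wall-clock time per optimizer step.
sec_per_step = training_hours * 3600 / total_steps

print(f"train perplexity: {train_ppl:.2f}")
print(f"eval perplexity:  {eval_ppl:.2f}")
print(f"seconds per step: {sec_per_step:.1f}")
```

Under that assumption the eval loss corresponds to a perplexity of roughly 11, and each optimizer step took a little over half a minute on average.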

Hardware

| Component | Specification |
| --- | --- |
| GPUs | 4× NVIDIA GeForce RTX 2080 Ti (11 GB VRAM each, 44 GB total) |
| CUDA version | 13.0 |
| Training framework | DDP via Hugging Face Accelerate |

Software Environment

| Library | Version |
| --- | --- |
| Python | 3.11.15 |
| PyTorch | 2.11.0+cu130 |
| Transformers | 4.57.6 |
| PEFT | 0.18.1 |
| TRL | 1.3.0 |
| BitsAndBytes | 0.49.2 |
| Accelerate | 1.13.0 |

Limitations

  • Domain scope: The model is optimized for Islamic Arabic Q&A. General Arabic tasks or non-Islamic domains may show degraded quality compared to the base Qwen3-4B.
  • Source attribution: While the model is trained to cite sources, citations should be independently verified — the model can produce plausible-sounding but incorrect references.
  • Classical vs. contemporary Fiqh: The training data emphasizes classical scholarship. Contemporary jurisprudential debates, especially minority or regional opinions, may be underrepresented.
  • Language: The model performs best in Arabic (Modern Standard and Classical). Responses in other languages are not guaranteed to be accurate or fluent.

Citation

@misc{alnwsany2026qwen3islamicarbic,
  author       = {Yahya Alnwsany},
  title        = {Qwen3-4B-Islamic-Arabic: QLoRA Fine-Tuning of Qwen3-4B on Islamic Arabic Q\&A},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/NightPrince/Qwen3-4B-Islamic-Arabic}},
  note         = {Base model: Qwen/Qwen3-4B. Dataset: NightPrince/islamic-arabic-qa.}
}

License

This model is released under the Apache 2.0 license, consistent with the base model Qwen/Qwen3-4B. See LICENSE for details.
