---
language:
- ar
license: apache-2.0
library_name: transformers
base_model: Qwen/Qwen3-4B
datasets:
- NightPrince/islamic-arabic-qa
tags:
- arabic
- islamic
- fiqh
- fatwa
- qlora
- peft
- qwen3
- instruction-tuning
- conversational
pipeline_tag: text-generation
inference:
  parameters:
    max_new_tokens: 512
    temperature: 0.3
    do_sample: true
widget:
- text: "ما حكم زكاة الفطر وما مقدارها؟"
  example_title: "زكاة الفطر"
- text: "ما الفرق بين الفرض والواجب عند الحنفية؟"
  example_title: "الفرض والواجب"
- text: "ما حكم بيع العينة في الفقه الإسلامي؟"
  example_title: "بيع العينة"
- text: "ما شروط صحة الصلاة؟"
  example_title: "شروط الصلاة"
model-index:
- name: Qwen3-4B-Islamic-Arabic
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Islamic Arabic Q&A
      type: NightPrince/islamic-arabic-qa
      split: validation
    metrics:
    - name: Validation Loss
      type: loss
      value: 2.4094
      verified: false
---

# Qwen3-4B-Islamic-Arabic

**Qwen3-4B fine-tuned on Islamic Arabic Q&A via QLoRA: merged FP16, ready for direct inference.**

This is the canonical, fully merged version of a Qwen3-4B model fine-tuned on 17,944 high-quality Islamic Arabic question-answer pairs spanning Fiqh, Fatwa, Aqeedah, Quran Sciences, and Islamic Finance. The LoRA adapter has been merged into the base weights and saved in FP16; no additional adapter loading is required.

Trained by **[Yahya Alnwsany (NightPrince)](https://huggingface.co/NightPrince)**, 2026-05-05.

---

## Model Variants

| Variant | Repo | Description |
|---|---|---|
| **Merged FP16** (this model) | [NightPrince/Qwen3-4B-Islamic-Arabic](https://huggingface.co/NightPrince/Qwen3-4B-Islamic-Arabic) | Canonical merged model, FP16, ~7.6 GB; drop-in for transformers or vLLM |
| **LoRA Adapter** | [NightPrince/Qwen3-4B-Islamic-Arabic-LoRA](https://huggingface.co/NightPrince/Qwen3-4B-Islamic-Arabic-LoRA) | PEFT adapter only, 264 MB; apply on top of `Qwen/Qwen3-4B` |
| **INT4 Quantized** | [NightPrince/Qwen3-4B-Islamic-Arabic-INT4](https://huggingface.co/NightPrince/Qwen3-4B-Islamic-Arabic-INT4) | W4A16 compressed-tensors for fast vLLM serving, 2.5 GB |
| **MLX 4-bit** | [NightPrince/Qwen3-4B-Islamic-Arabic-mlx-4Bit](https://huggingface.co/NightPrince/Qwen3-4B-Islamic-Arabic-mlx-4Bit) | Apple Silicon / MLX: native Mac inference, 4-bit quantized |
| **GGUF** | [NightPrince/Qwen3-4B-Islamic-Arabic-GGUF](https://huggingface.co/NightPrince/Qwen3-4B-Islamic-Arabic-GGUF) | llama.cpp / Ollama / LM Studio: Q4_K_M (2.3 GB), Q8_0 (4.0 GB), F16 (7.5 GB) |
| **Dataset** | [NightPrince/islamic-arabic-qa](https://huggingface.co/datasets/NightPrince/islamic-arabic-qa) | Islamic Arabic Q&A pairs: 17,944 train / 2,101 val / 1,042 test |

---

## Training Metrics

### Loss Curve

| Checkpoint | Train Loss | Eval Loss |
|---|---|---|
| Step 0 (init) | n/a | n/a |
| Step 843 (final) | **1.8918** | **2.4094** (best) |

### Token Accuracy

| Phase | Token Accuracy |
|---|---|
| Early training | ~50% |
| End of training | ~60% |

> **MCQ evaluation coming soon:** a multiple-choice benchmark (Islamic domain) is prepared but requires serving the model via vLLM. Results will be posted here once available.

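The losses in the table above are easier to compare when expressed as perplexity, `exp(loss)`. The snippet below is a minimal sketch of that conversion, assuming the reported values are mean per-token cross-entropy in nats (the usual convention for causal-LM training, though this card does not state it). The helper name `perplexity` is illustrative only.

```python
import math


def perplexity(mean_nll: float) -> float:
    """Convert a mean per-token negative log-likelihood (nats) to perplexity."""
    return math.exp(mean_nll)


# Loss values copied from the Loss Curve table.
final_train_loss = 1.8918
best_eval_loss = 2.4094

train_ppl = perplexity(final_train_loss)
eval_ppl = perplexity(best_eval_loss)

print(f"train perplexity: {train_ppl:.2f}")
print(f"eval perplexity:  {eval_ppl:.2f}")
```

Under that assumption, the final train and best eval losses correspond to perplexities of roughly 6.6 and 11.1.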
---

## Usage

### Transformers Inference

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NightPrince/Qwen3-4B-Islamic-Arabic"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Arabic system prompt. In English: "You are a specialized Islamic scholar
# assistant. Answer questions accurately based on the Holy Quran, the Prophetic
# Sunnah, and classical Islamic jurisprudence. Cite sources wherever possible.
# Be concise but comprehensive."
SYSTEM_PROMPT = (
    "أنت مساعد عالم إسلامي متخصص. "
    "أجب على الأسئلة بدقة استناداً إلى القرآن الكريم والسنة النبوية والفقه الإسلامي الكلاسيكي. "
    "استشهد بالمصادر حيثما أمكن. كن موجزاً لكن شاملاً."
)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    # User question: "What is the ruling on zakat on saved money?"
    {"role": "user", "content": "ما حكم الزكاة على المال المدخر؟"},
]

# Render the conversation with the model's chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
    )

# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[1]
response = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
print(response)
```

### vLLM Serving

The merged FP16 weights take ~7.6 GB. On 11 GB GPUs (e.g., RTX 2080 Ti), use at least `tensor_parallel_size=2` (`--tensor-parallel-size 2` on the CLI) so that the KV cache has room; alternatively, use a single 24 GB+ GPU.

```bash
# Install vLLM if needed
pip install vllm

# Serve with tensor parallelism across 2 GPUs
vllm serve NightPrince/Qwen3-4B-Islamic-Arabic \
  --dtype float16 \
  --tensor-parallel-size 2 \
  --max-model-len 4096 \
  --port 8000
```

Query the running server:

```python
from openai import OpenAI

# Any placeholder key works unless the server was started with --api-key.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="token-abc123")

# Same Arabic system prompt as in the Transformers example above.
SYSTEM_PROMPT = (
    "أنت مساعد عالم إسلامي متخصص. "
    "أجب على الأسئلة بدقة استناداً إلى القرآن الكريم والسنة النبوية والفقه الإسلامي الكلاسيكي. "
    "استشهد بالمصادر حيثما أمكن. كن موجزاً لكن شاملاً."
)

response = client.chat.completions.create(
    model="NightPrince/Qwen3-4B-Islamic-Arabic",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        # User question: "What is the ruling on zakat on saved money?"
        {"role": "user", "content": "ما حكم الزكاة على المال المدخر؟"},
    ],
    max_tokens=512,
    temperature=0.7,
)
print(response.choices[0].message.content)
```

> **Prefer lower memory?** Use the [INT4 quantized variant](https://huggingface.co/NightPrince/Qwen3-4B-Islamic-Arabic-INT4) (2.5 GB) for vLLM or the [GGUF variant](https://huggingface.co/NightPrince/Qwen3-4B-Islamic-Arabic-GGUF) for llama.cpp / Ollama.

---

## Training Details

### Dataset

| Property | Value |
|---|---|
| Dataset | [NightPrince/islamic-arabic-qa](https://huggingface.co/datasets/NightPrince/islamic-arabic-qa) |
| Train split | 17,944 samples |
| Validation split | 2,101 samples |
| Test split | 1,042 samples |
| Language | Arabic (Modern Standard + Classical) |
| Domains | Fiqh, Fatwa, Aqeedah, Quran Sciences, Islamic Finance |
| Quality filter | Applied: deduplication, length filtering, domain relevance scoring |
| Format | Instruction-following (system / user / assistant) |

### Hyperparameters

| Hyperparameter | Value |
|---|---|
| Epochs | 3 |
| Per-device batch size | 1 |
| Gradient accumulation steps | 16 |
| Effective batch size | 64 (1 per device × 16 accumulation steps × 4 GPUs) |
| Learning rate | 2e-4 |
| LR scheduler | Cosine with warmup |
| Warmup ratio | 0.05 |
| Max sequence length | 1,024 tokens |
| Optimizer | AdamW (paged, 8-bit) |
| Precision | QLoRA (4-bit base + BF16 adapters) |
| Gradient checkpointing | Enabled |
| Loss masking | Assistant turns only (`assistant_only_loss=True`) |

### LoRA Configuration

| Parameter | Value |
|---|---|
| Rank (r) | 64 |
| Alpha (α) | 128 |
| Dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Trainable parameters | 132,120,576 |
| % of total parameters | ~3.18% of 4.15B (5.65% when measured against the 4-bit packed parameter count) |

### Results

| Metric | Value |
|---|---|
| Final train loss | 1.8918 |
| Best eval loss | 2.4094 |
| Total training steps | 843 |
| Training duration | 7.59 hours |
| Token accuracy (start → end) | ~50% → ~60% |
| MCQ benchmark | Coming soon (requires vLLM serving) |

### Hardware

| Component | Specification |
|---|---|
| GPUs | 4× NVIDIA GeForce RTX 2080 Ti (11 GB VRAM each, 44 GB total) |
| CUDA version | 13.0 |
| Training framework | DDP via Hugging Face Accelerate |

### Software Environment

| Library | Version |
|---|---|
| Python | 3.11.15 |
| PyTorch | 2.11.0+cu130 |
| Transformers | 4.57.6 |
| PEFT | 0.18.1 |
| TRL | 1.3.0 |
| BitsAndBytes | 0.49.2 |
| Accelerate | 1.13.0 |

---

## Limitations

- **Domain scope**: The model is optimized for Islamic Arabic Q&A. General Arabic tasks or non-Islamic domains may show degraded quality compared to the base Qwen3-4B.
- **Source attribution**: The model is trained to cite sources, but it can produce plausible-sounding yet incorrect references, so citations should be independently verified.
- **Classical vs. contemporary Fiqh**: The training data emphasizes classical scholarship. Contemporary jurisprudential debates, especially minority or regional opinions, may be underrepresented.
- **Language**: The model performs best in Arabic (Modern Standard and Classical). Responses in other languages are not guaranteed to be accurate or fluent.

---

## Citation

```bibtex
@misc{alnwsany2026qwen3islamicarbic,
  author       = {Yahya Alnwsany},
  title        = {Qwen3-4B-Islamic-Arabic: QLoRA Fine-Tuning of Qwen3-4B on Islamic Arabic Q\&A},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/NightPrince/Qwen3-4B-Islamic-Arabic}},
  note         = {Base model: Qwen/Qwen3-4B. Dataset: NightPrince/islamic-arabic-qa.}
}
```

---

## License

This model is released under the **Apache 2.0** license, consistent with the base model [Qwen/Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B). See [LICENSE](https://www.apache.org/licenses/LICENSE-2.0) for details.
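
---

## Appendix: Sanity-Checking the Training Numbers

The effective batch size, total step count, and trainable-parameter count reported under Training Details can be reproduced with simple arithmetic. The sketch below is not the author's training code. It assumes the base model's layer shapes (hidden size 2560, MLP size 9728, 36 layers, 32 query heads and 8 key/value heads of dimension 128), which come from the published Qwen3-4B configuration rather than from this card, and it assumes one optimizer step per effective batch with a partial final batch counted as a step.

```python
import math

# Values copied from the Dataset, Hyperparameters, LoRA, and Hardware tables.
train_samples = 17_944
per_device_batch = 1
grad_accum_steps = 16
num_gpus = 4
epochs = 3
lora_rank = 64

# Assumed Qwen3-4B shapes (from the base model config, not from this card).
hidden, intermediate, num_layers = 2560, 9728, 36
q_out = 32 * 128   # query heads * head_dim
kv_out = 8 * 128   # key/value heads * head_dim

# (in_features, out_features) of each targeted linear layer.
target_modules = {
    "q_proj": (hidden, q_out),
    "k_proj": (hidden, kv_out),
    "v_proj": (hidden, kv_out),
    "o_proj": (q_out, hidden),
    "gate_proj": (hidden, intermediate),
    "up_proj": (hidden, intermediate),
    "down_proj": (intermediate, hidden),
}

effective_batch = per_device_batch * grad_accum_steps * num_gpus
steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * epochs

# A LoRA pair adds r * in_features (A) plus r * out_features (B) weights.
lora_params = num_layers * sum(
    lora_rank * (fan_in + fan_out) for fan_in, fan_out in target_modules.values()
)

print(f"effective batch size: {effective_batch}")
print(f"total optimizer steps: {total_steps}")
print(f"LoRA trainable parameters: {lora_params:,}")
```

Under these assumptions the script gives an effective batch of 64, 843 optimizer steps, and 132,120,576 trainable parameters, matching the tables.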