---
license: apache-2.0
base_model: Qwen/Qwen2.5-3B-Instruct
datasets:
- Khurram123/kulliyat-e-iqbal-shaheen
tags:
- iqbaliat
- urdu-poetry
- persian-poetry
- philosophy
- sufism
- unsloth
- qwen
- shaheen
- conversational
language:
- ur
- fa
metrics:
- loss
library_name: transformers
pipeline_tag: text-generation
model_name: Shaheen-Qwen2.5-3B-Kulliyat-e-Iqbal
---


"تو شاہیں ہے، پرواز ہے کام تیرا"
— علامہ اقبال کے کلام اور فلسفے پر مبنی پہلا سپیشلائزڈ لسانی ماڈل


# 🦅 Shaheen-Qwen2.5-3B-Kulliyat-e-Iqbal (v1.0) 🇵🇰

**Shaheen-Qwen2.5-3B** is a specialized large language model fine-tuned on the complete poetic works of **Allama Muhammad Iqbal**. Built on the **Qwen 2.5 3B Instruct** architecture, it is designed to interpret, explain, and contextualize the philosophical depth of **Iqbaliyat** in both Urdu and Persian. Trained on the **Kulliyat-e-Iqbal Shaheen Dataset** (11,659 records), the model bridges classical wisdom and modern conversational AI.

---

## 🌟 Model Highlights

- **Specialized Knowledge:** Trained on 11 major Urdu and Persian books of Allama Iqbal.
- **Bilingual Proficiency:** Understands Persian (Farsi) couplets and explains them in simple Urdu.
- **Philosophical Insight:** Optimized to discuss core concepts such as **Khudi (Selfhood)**, **Ishq (Divine Love)**, and **Shaheen (The Eagle)**.
- **Lightweight & Efficient:** At 3 billion parameters, the model delivers fast inference on consumer-grade hardware (e.g. an RTX 4060 Ti).
- **Optimization:** Fine-tuned with **Unsloth** using 4-bit LoRA for maximum performance with a minimal VRAM footprint.
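The VRAM benefit of 4-bit weights can be sanity-checked with back-of-the-envelope arithmetic. This is a rough sketch only: the exact parameter count (~3.1B here is an assumption) and quantization/dequantization overheads vary by implementation.

```python
# Approximate weight memory for a ~3.1B-parameter model (assumed count).
# Excludes KV cache, activations, and quantization scale/zero-point overhead.
params = 3.1e9
bytes_fp16 = params * 2    # 16 bits = 2 bytes per weight
bytes_4bit = params * 0.5  # 4 bits = 0.5 bytes per weight

print(f"fp16 weights:  {bytes_fp16 / 1e9:.1f} GB")   # ~6.2 GB
print(f"4-bit weights: {bytes_4bit / 1e9:.2f} GB")   # ~1.55 GB
```

Even with cache and activation overhead on top, the quantized weights leave ample headroom on a 16 GB card.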
---

## 📊 Training Details

- **Base Model:** `unsloth/qwen2.5-3b-instruct-bnb-4bit`
- **Dataset Size:** 11,659 rows (instruction-response pairs)
- **Epochs:** 1.37
- **Final Train Loss:** 1.39
- **Hardware:** NVIDIA GeForce RTX 4060 Ti (16 GB)
- **Software:** Ubuntu Linux + Unsloth AI

---

## 📚 Dataset Composition

The model was trained on the entire poetic corpus:

| Language | Primary Books Included |
| :--- | :--- |
| **Urdu** | Bang-e-Dara, Bal-e-Jibreel, Zarb-e-Kaleem, Armaghan-e-Hijaz (Urdu) |
| **Persian** | Asrar-e-Khudi, Rumuz-e-Bekhudi, Payam-e-Mashriq, Zabur-e-Ajam, Javid Nama, Pas Cha Bayad Kard |

---

## 🚀 How to Use (Inference)

### Via Transformers & Unsloth

```python
from unsloth import FastLanguageModel
import torch

# Load the fine-tuned model in 4-bit so it fits comfortably in consumer VRAM.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Khurram123/Shaheen-3B-Kulliyat-e-Iqbal",
    max_seq_length = 2048,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference path

# "What is the message of Allama Iqbal's concept of Khudi (selfhood)?"
prompt = "علامہ اقبال کے تصورِ خودی کا پیغام کیا ہے؟"

# Qwen 2.5 uses the ChatML format; wrap the prompt in its turn markers.
inputs = tokenizer(
    [f"<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n"],
    return_tensors="pt",
).to("cuda")

outputs = model.generate(**inputs, max_new_tokens=250, temperature=0.5)
print(tokenizer.batch_decode(outputs)[0])
```
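The hand-written ChatML string in the inference snippet can be factored into a small helper. `build_chatml_prompt` is a hypothetical name used here for illustration; in practice, `tokenizer.apply_chat_template` is the more robust way to produce the same format.

```python
def build_chatml_prompt(user_message: str) -> str:
    # Qwen 2.5 follows the ChatML convention: each turn is delimited by
    # <|im_start|>role ... <|im_end|>, and generation continues from an
    # opened (unterminated) assistant turn.
    return (
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt("علامہ اقبال کے تصورِ خودی کا پیغام کیا ہے؟")
```

The returned string can be passed to `tokenizer(...)` exactly as in the snippet above.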