Developed by Khurram Pervez (Khurramcoder), this model is a fine-tuned version of Meta's Llama-3.2-3B-Instruct, specifically optimized for high-quality Urdu instruction following and generation.
Model Highlights
Native Urdu Reasoning: Trained on the large-traversaal/urdu-instruct dataset (51.7k rows), enabling the model to handle translation, creative writing, and QA tasks with cultural nuance.
Efficient Architecture: Fine-tuned using Unsloth and QLoRA on an NVIDIA RTX 4060 Ti, making it a powerful yet lightweight 3B parameter model.
Modern Tokenizer: Uses the Llama 3.2 multilingual tokenizer for better Urdu script handling.
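Because the model was fine-tuned with QLoRA, it can also be loaded in 4-bit for low-VRAM inference. A minimal sketch, assuming `bitsandbytes` is installed; the exact quantization settings below mirror a typical QLoRA setup and are an assumption, not the author's recorded training config:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization config (typical QLoRA-style settings; adjust as needed)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the checkpoint quantized to 4-bit, placing layers automatically
model = AutoModelForCausalLM.from_pretrained(
    "Khurram123/Urdu-Llama-3.2-3B-Instruct-v1",
    quantization_config=bnb_config,
    device_map="auto",
)
```

Loaded this way, the 3B model fits comfortably on consumer GPUs such as the RTX 4060 Ti it was trained on.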
How to Use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Khurram123/Urdu-Llama-3.2-3B-Instruct-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Urdu instruction: "Write a short note on the future of artificial intelligence."
instruction = "مصنوعی ذہانت کے مستقبل پر ایک مختصر نوٹ لکھیں۔"
prompt = f"### ہدایت:\n{instruction}\n\n### جواب:\n"

# Move inputs to the same device the model was placed on
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
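The prompt template above ("### ہدایت:" for the instruction, "### جواب:" for the answer) can be wrapped in a small helper so every call uses the exact framing the model was fine-tuned on. The function name is illustrative:

```python
def build_urdu_prompt(instruction: str) -> str:
    """Wrap an Urdu instruction in the ### ہدایت / ### جواب template."""
    return f"### ہدایت:\n{instruction}\n\n### جواب:\n"

# Example: same instruction as above, ready to pass to the tokenizer
prompt = build_urdu_prompt("مصنوعی ذہانت کے مستقبل پر ایک مختصر نوٹ لکھیں۔")
```

Keeping the template in one place avoids subtle formatting drift (a missing newline or header) that can degrade instruction-tuned model output.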