---
library_name: transformers
license: apache-2.0
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- sanskrit
- translation
- transliteration
- qwen
- axolotl
- iast
- devanagari
- bilingual
datasets:
- diabolic6045/Sanskrit-transliteration-chat-dataset
model-index:
- name: Sanskrit-qwen-7B-Translate-v2
  results: []
---
# Sanskrit-qwen-7B-Translate-v2



**A specialized Sanskrit language model for translation and transliteration tasks**
## 🌟 Model Description
This is a fine-tuned version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) for Sanskrit language processing. The model was trained with LoRA (Low-Rank Adaptation) on the `diabolic6045/Sanskrit-transliteration-chat-dataset` dataset and targets three tasks:
1. **Sanskrit to IAST Transliteration** - Converting Devanagari script to IAST format
2. **Sanskrit to English Translation** - Translating Sanskrit text to English
3. **English to Sanskrit Translation** - Translating English text to Sanskrit
## 🚀 Key Features
### ✨ **Multi-Task Sanskrit Processing**
- **IAST Transliteration**: Accurate conversion from Devanagari to IAST
- **Bidirectional Translation**: Sanskrit ↔ English translation
- **Context-Aware**: Aims to preserve meaning and cultural context
- **Chat-Optimized**: Uses conversation format for natural interactions
### 🔧 **Technical Improvements Over Previous Model**
- **Different Base Model**: Switched from Qwen2.5-7B-Instruct-1M to Qwen2.5-7B-Instruct
- **Specialized Dataset**: Trained on `Sanskrit-transliteration-chat-dataset` (vs. previous `Sanskrit-llama`)
- **Chat Template Format**: Uses a structured conversation format instead of the Alpaca format
- **Revised LoRA Configuration**: Rank 16 with alpha 32 (the previous model used rank 32)
- **Memory Efficient**: Trained with flash attention and gradient checkpointing to reduce memory use
## 📊 Model Specifications
| Parameter | Value |
|-----------|-------|
| **Base Model** | Qwen/Qwen2.5-7B-Instruct |
| **Fine-tuning Method** | LoRA (Low-Rank Adaptation) |
| **LoRA Rank** | 16 |
| **LoRA Alpha** | 32 |
| **Sequence Length** | 512 tokens |
| **Training Epochs** | 3 |
| **Learning Rate** | 2e-05 |
| **Batch Size** | 2 (micro) × 4 (gradient accumulation) = 8 per GPU |
| **Optimizer** | AdamW 8-bit |
| **Precision** | bfloat16 |
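
Two derived quantities follow directly from the table above. The sketch below computes them, assuming the standard LoRA formulation in which the low-rank update is scaled by `alpha / r` (not the rank-stabilized variant). The variable names are illustrative and do not come from the training code.

```python
# Values copied from the specification table above.
lora_rank = 16
lora_alpha = 32
micro_batch_size = 2
gradient_accumulation_steps = 4

# Standard LoRA scales the low-rank update by alpha / r
# before adding it to the frozen base weight.
lora_scaling = lora_alpha / lora_rank

# Sequences contributing to each optimizer step on a single GPU.
effective_batch_per_gpu = micro_batch_size * gradient_accumulation_steps

print(f"LoRA scaling factor: {lora_scaling}")
print(f"Effective batch size per GPU: {effective_batch_per_gpu}")
```

On a multi-GPU run the global batch size would be this per-GPU figure multiplied by the number of GPUs, which this card does not state.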
## 🎯 Intended Uses
### ✅ **Recommended Use Cases**
- **Academic Research**: Sanskrit text analysis and translation
- **Educational Tools**: Learning Sanskrit through translation
- **Cultural Preservation**: Digitizing Sanskrit manuscripts
- **Linguistic Studies**: Comparative language analysis
- **Content Creation**: Sanskrit-English bilingual content
### ⚠️ **Limitations**
- **Experimental Model**: Still in development, results may vary
- **Context Sensitivity**: Performance depends on text complexity
- **Domain Specific**: Optimized for classical Sanskrit texts
- **Verification Required**: Important translations should be cross-checked
## 🛠️ Usage Examples
### 1. Sanskrit to IAST Transliteration
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "diabolic6045/Sanskrit-qwen-7B-Translate-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Prepare the conversation
messages = [
    {
        "role": "system",
        "content": "You are a Sanskrit transliteration expert. Convert the given Sanskrit text from Devanagari script to IAST (International Alphabet of Sanskrit Transliteration) format."
    },
    {
        "role": "user",
        "content": "Transliterate this Sanskrit text to IAST: बुद्धिश्चार्थात्परो लोभः सन्तोषः परमं सुखम् ।"
    }
]

# Apply the chat template and move the inputs to the model's device
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
# Example output: buddhiścārthātparo lobhaḥ santoṣaḥ paramaṃ sukham |
```
### 2. Sanskrit to English Translation
```python
# Reuses `tokenizer` and `model` loaded in example 1
messages = [
    {
        "role": "system",
        "content": "You are a Sanskrit to English translation expert. Translate the given Sanskrit text accurately while preserving the meaning and context."
    },
    {
        "role": "user",
        "content": "Translate this Sanskrit text to English: यद॒ग्नौ सूर्ये॑ वि॒षं पृ॑थि॒व्यामोष॑धीषु॒ यत् ।"
    }
]

# Generate translation
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
# Example output: The poison that is in the sun, in the earth and in the herbs...
```
### 3. English to Sanskrit Translation
```python
# Reuses `tokenizer` and `model` loaded in example 1
messages = [
    {
        "role": "system",
        "content": "You are an English to Sanskrit translation expert. Translate the given English text accurately into Sanskrit while preserving the meaning and context."
    },
    {
        "role": "user",
        "content": "Translate this English text to Sanskrit: May the divine powers protect us and grant us wisdom."
    }
]

# Generate Sanskrit translation
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
# Example output: देवाः अस्मान् रक्षन्तु बुद्धिं च प्रयच्छन्तु ।
```
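
The three examples differ only in the system prompt and the instruction prefix, so the message construction can be factored into one helper. The following is a minimal sketch: the task keys (`transliterate`, `sa_to_en`, `en_to_sa`) and the `build_messages` function are hypothetical names introduced here for illustration, while the prompt strings are copied from the examples above.

```python
# Prompt strings copied from the usage examples; the task keys are hypothetical.
SYSTEM_PROMPTS = {
    "transliterate": (
        "You are a Sanskrit transliteration expert. Convert the given Sanskrit text "
        "from Devanagari script to IAST (International Alphabet of Sanskrit "
        "Transliteration) format."
    ),
    "sa_to_en": (
        "You are a Sanskrit to English translation expert. Translate the given "
        "Sanskrit text accurately while preserving the meaning and context."
    ),
    "en_to_sa": (
        "You are an English to Sanskrit translation expert. Translate the given "
        "English text accurately into Sanskrit while preserving the meaning and context."
    ),
}

USER_PREFIXES = {
    "transliterate": "Transliterate this Sanskrit text to IAST: ",
    "sa_to_en": "Translate this Sanskrit text to English: ",
    "en_to_sa": "Translate this English text to Sanskrit: ",
}


def build_messages(task, text):
    """Return a chat-format message list for one of the three supported tasks."""
    if task not in SYSTEM_PROMPTS:
        raise ValueError(f"Unknown task {task!r}; expected one of {sorted(SYSTEM_PROMPTS)}")
    return [
        {"role": "system", "content": SYSTEM_PROMPTS[task]},
        {"role": "user", "content": USER_PREFIXES[task] + text},
    ]


messages = build_messages("en_to_sa", "May the divine powers protect us and grant us wisdom.")
print(messages[1]["content"])
```

The returned list can be passed to `tokenizer.apply_chat_template` exactly as in the examples above.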
## 🎮 Interactive Demo
Try the model with our Gradio interface:
### Run the interactive [demo](https://huggingface.co/spaces/diabolic6045/Sanskrit-qwen-7B-Translate-v2)
The demo provides:
- **Mode Selection**: Choose between transliteration and translation modes
- **Real-time Processing**: Instant results with adjustable parameters
- **Example Library**: Pre-loaded examples for each mode
- **Parameter Tuning**: Adjust temperature and max length
## 📈 Training Details
### Dataset Information
- **Source**: `diabolic6045/Sanskrit-transliteration-chat-dataset`
- **Format**: Chat template with structured conversations
- **Size**: Comprehensive Sanskrit corpus with multiple translation pairs
- **Validation Split**: 10% for evaluation
### Training Configuration
```yaml
# Key training parameters
base_model: Qwen/Qwen2.5-7B-Instruct
adapter: lora
lora_r: 16
lora_alpha: 32
sequence_len: 512
num_epochs: 3
learning_rate: 0.00002
optimizer: adamw_8bit
lr_scheduler: cosine
bf16: auto
flash_attention: true
gradient_checkpointing: true
```
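
To illustrate what `lr_scheduler: cosine` implies for the learning rate, here is a minimal sketch of a plain cosine decay from the peak rate of 2e-5. It assumes no warmup and a floor of zero; the actual run may have used warmup steps or other settings not shown in the excerpt above, and `cosine_lr` is a hypothetical helper, not part of Axolotl.

```python
import math


def cosine_lr(step, total_steps, peak_lr=2e-5, min_lr=0.0):
    """Cosine decay from peak_lr to min_lr over total_steps (no warmup assumed)."""
    # Clamp progress to [0, 1] so out-of-range steps stay at the endpoints.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Learning rate at the start, midpoint, and end of a hypothetical 1000-step run.
for step in (0, 500, 1000):
    print(step, cosine_lr(step, total_steps=1000))
```

The rate starts at the peak, passes through half the peak at the midpoint, and reaches the floor at the final step.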
### Hardware Requirements
- **Training**: Multi-GPU setup with 24GB+ VRAM per GPU
- **Inference**: 8GB+ VRAM, which likely requires quantized weights (the 7B weights alone take roughly 15 GB in bfloat16)
- **CPU**: Compatible with CPU inference (slower)
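
As a rough sanity check on the inference figure, the memory needed just to hold the weights can be estimated from the parameter count and the bytes per parameter. This sketch assumes about 7.61 billion parameters for the Qwen2.5-7B base model and ignores activations, the KV cache, and framework overhead, so real usage will be higher. `weight_memory_gib` is a hypothetical helper.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory for model weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


# Assumption: approximate parameter count of the Qwen2.5-7B base model.
NUM_PARAMS = 7.61e9

for name, bytes_per_param in [("bfloat16", 2), ("int8", 1), ("4-bit", 0.5)]:
    print(f"{name}: {weight_memory_gib(NUM_PARAMS, bytes_per_param):.1f} GiB")
```

Under these assumptions, only the quantized variants fit comfortably within an 8 GB budget.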
## 🔄 Comparison with Previous Model
| Feature | Previous Model | Current Model |
|---------|---------------|---------------|
| **Base Model** | Qwen2.5-7B-Instruct-1M | Qwen2.5-7B-Instruct |
| **Dataset** | Sanskrit-llama (Alpaca) | Sanskrit-transliteration-chat-dataset |
| **Format** | Alpaca format | Chat template format |
| **Capabilities** | Basic translation | Multi-task (transliteration + translation) |
| **LoRA Rank** | 32 | 16 (optimized) |
| **Sequence Length** | 1024 | 512 (focused) |
| **Training Epochs** | 1 | 3 (more thorough) |
| **Specialization** | General Sanskrit | Specialized for transliteration |
## 🛡️ Ethical Considerations
- **Cultural Sensitivity**: Respect for Sanskrit's cultural and religious significance
- **Accuracy Disclaimer**: Model outputs should be verified for important translations
- **Educational Use**: Primarily intended for educational and research purposes
- **Bias Awareness**: May reflect biases present in training data
## 📚 Citation
If you use this model in your research, please cite:
```bibtex
@misc{sanskrit-qwen-chat-lora,
  title={Sanskrit-qwen-7B-Translate-v2: A Specialized Sanskrit Translation and Transliteration Model},
  author={Divax Shah (diabolic6045)},
  year={2024},
  url={https://huggingface.co/diabolic6045/Sanskrit-qwen-7B-Translate-v2}
}
```
## 🤝 Contributing
We welcome contributions to improve this model:
1. **Dataset Contributions**: High-quality Sanskrit translation pairs
2. **Evaluation**: Benchmarking and performance analysis
3. **Bug Reports**: Issues and improvement suggestions
4. **Documentation**: Usage examples and tutorials
## 📄 License
This model is released under the Apache 2.0 License. See the [LICENSE](LICENSE) file for details.
## 🙏 Acknowledgments
- **Qwen Team**: For the excellent base model
- **Axolotl Framework**: For the training infrastructure
- **Sanskrit Community**: For linguistic guidance and feedback
- **Open Source Community**: For tools and resources
---
**Built with ❤️ for Sanskrit language preservation and education**
[Built with Axolotl](https://github.com/axolotl-ai-cloud/axolotl)