Model: diabolic6045/Sanskrit-qwen-7B-Translate-v2 Source: Original Platform
| library_name | license | base_model | tags | datasets | model-index |
|---|---|---|---|---|---|
| transformers | apache-2.0 | Qwen/Qwen2.5-7B-Instruct | | | |
Sanskrit-qwen-7B-Translate-v2
🌟 Model Description
This is a fine-tuned version of Qwen/Qwen2.5-7B-Instruct specifically optimized for Sanskrit language processing. The model has been trained using LoRA (Low-Rank Adaptation) on a comprehensive Sanskrit dataset to excel in three key areas:
- Sanskrit to IAST Transliteration - Converting Devanagari script to IAST format
- Sanskrit to English Translation - Translating Sanskrit text to English
- English to Sanskrit Translation - Translating English text to Sanskrit
🚀 Key Features
✨ Multi-Task Sanskrit Processing
- IAST Transliteration: Accurate conversion from Devanagari to IAST
- Bidirectional Translation: Sanskrit ↔ English translation
- Context-Aware: Aims to preserve meaning and cultural context in translation
- Chat-Optimized: Uses conversation format for natural interactions
🔧 Technical Improvements Over Previous Model
- Enhanced Base Model: Upgraded from Qwen2.5-7B-Instruct-1M to Qwen2.5-7B-Instruct
- Specialized Dataset: Trained on Sanskrit-transliteration-chat-dataset (vs. the previous Sanskrit-llama)
- Chat Template Format: Uses structured conversation format for better performance
- Optimized LoRA: Improved LoRA configuration with better target modules
- Memory Efficient: Enhanced with flash attention and gradient checkpointing
📊 Model Specifications
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-7B-Instruct |
| Fine-tuning Method | LoRA (Low-Rank Adaptation) |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| Sequence Length | 512 tokens |
| Training Epochs | 3 |
| Learning Rate | 2e-05 |
| Batch Size | 2 (micro) × 4 (gradient accumulation) |
| Optimizer | AdamW 8-bit |
| Precision | bfloat16 |
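Two derived quantities follow from the table above: the effective batch size per optimizer step and the LoRA scaling factor. The sketch below assumes a single device (the number of GPUs used is not stated here) and the conventional LoRA scaling of alpha divided by rank; the variable names are illustrative only.

```python
# Values taken from the specification table above.
micro_batch_size = 2
gradient_accumulation_steps = 4
lora_rank = 16
lora_alpha = 32

# Assumption: one device. With N data-parallel GPUs, multiply by N.
num_devices = 1

# Number of sequences contributing to each optimizer step.
effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_devices

# Conventional LoRA scales the low-rank update by alpha / rank.
lora_scaling = lora_alpha / lora_rank

print(f"Effective batch size: {effective_batch_size}")
print(f"LoRA scaling factor: {lora_scaling}")
```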
🎯 Intended Uses
✅ Recommended Use Cases
- Academic Research: Sanskrit text analysis and translation
- Educational Tools: Learning Sanskrit through translation
- Cultural Preservation: Digitizing Sanskrit manuscripts
- Linguistic Studies: Comparative language analysis
- Content Creation: Sanskrit-English bilingual content
⚠️ Limitations
- Experimental Model: Still in development, results may vary
- Context Sensitivity: Performance depends on text complexity
- Domain Specific: Optimized for classical Sanskrit texts
- Verification Required: Important translations should be cross-checked
🛠️ Usage Examples
1. Sanskrit to IAST Transliteration
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "diabolic6045/Sanskrit-qwen-7B-Translate-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Prepare the conversation
messages = [
    {
        "role": "system",
        "content": "You are a Sanskrit transliteration expert. Convert the given Sanskrit text from Devanagari script to IAST (International Alphabet of Sanskrit Transliteration) format."
    },
    {
        "role": "user",
        "content": "Transliterate this Sanskrit text to IAST: बुद्धिश्चार्थात्परो लोभः सन्तोषः परमं सुखम् ।"
    }
]

# Apply chat template and generate
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# temperature only applies when sampling is enabled, so set do_sample explicitly
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
# Example output: buddhiścārthātparo lobhaḥ santoṣaḥ paramaṃ sukham |
```
2. Sanskrit to English Translation
```python
# Reuses the tokenizer and model loaded in the first example
messages = [
    {
        "role": "system",
        "content": "You are a Sanskrit to English translation expert. Translate the given Sanskrit text accurately while preserving the meaning and context."
    },
    {
        "role": "user",
        "content": "Translate this Sanskrit text to English: यद॒ग्नौ सूर्ये॑ वि॒षं पृ॑थि॒व्यामोष॑धीषु॒ यत् ।"
    }
]

# Generate translation
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# temperature only applies when sampling is enabled, so set do_sample explicitly
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
# Example output: The poison that is in the sun, in the earth and in the herbs...
```
3. English to Sanskrit Translation
```python
# Reuses the tokenizer and model loaded in the first example
messages = [
    {
        "role": "system",
        "content": "You are an English to Sanskrit translation expert. Translate the given English text accurately into Sanskrit while preserving the meaning and context."
    },
    {
        "role": "user",
        "content": "Translate this English text to Sanskrit: May the divine powers protect us and grant us wisdom."
    }
]

# Generate Sanskrit translation
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# temperature only applies when sampling is enabled, so set do_sample explicitly
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
# Example output: देवाः अस्मान् रक्षन्तु बुद्धिं च प्रयच्छन्तु ।
```
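The three examples differ only in their system prompt and user-message prefix, so the message construction can be factored into one helper. The sketch below is a hypothetical convenience wrapper, not part of the model or of `transformers`; the names `TASK_PROMPTS` and `build_messages` and the task keys are assumptions made for illustration. The prompt strings are copied from the examples above.

```python
# Hypothetical helper: TASK_PROMPTS, build_messages and the task keys are
# illustrative names, not part of the model's API.
TASK_PROMPTS = {
    "transliterate": (
        "You are a Sanskrit transliteration expert. Convert the given Sanskrit "
        "text from Devanagari script to IAST (International Alphabet of "
        "Sanskrit Transliteration) format.",
        "Transliterate this Sanskrit text to IAST: ",
    ),
    "sanskrit-to-english": (
        "You are a Sanskrit to English translation expert. Translate the given "
        "Sanskrit text accurately while preserving the meaning and context.",
        "Translate this Sanskrit text to English: ",
    ),
    "english-to-sanskrit": (
        "You are an English to Sanskrit translation expert. Translate the given "
        "English text accurately into Sanskrit while preserving the meaning "
        "and context.",
        "Translate this English text to Sanskrit: ",
    ),
}


def build_messages(task, text):
    """Return a chat-format message list for one of the three supported tasks."""
    if task not in TASK_PROMPTS:
        raise ValueError(f"Unknown task {task!r}; expected one of {sorted(TASK_PROMPTS)}")
    system_prompt, user_prefix = TASK_PROMPTS[task]
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prefix + text},
    ]


messages = build_messages("transliterate", "सन्तोषः परमं सुखम् ।")
print(messages[1]["content"])
```

The returned list can be passed to `tokenizer.apply_chat_template` exactly as in the examples above.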
🎮 Interactive Demo
Try the model with our Gradio interface:
Run the interactive demo
The demo provides:
- Mode Selection: Choose between transliteration and translation modes
- Real-time Processing: Instant results with adjustable parameters
- Example Library: Pre-loaded examples for each mode
- Parameter Tuning: Adjust temperature and max length
📈 Training Details
Dataset Information
- Source: diabolic6045/Sanskrit-transliteration-chat-dataset
- Format: Chat template with structured conversations
- Size: Comprehensive Sanskrit corpus with multiple translation pairs
- Validation Split: 10% for evaluation
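To illustrate what a 10% validation split over chat-format records looks like, here is a minimal sketch in plain Python. The record layout (a `messages` list of role/content pairs), the placeholder contents, the seed, and the function name `train_validation_split` are all assumptions for illustration; the real dataset's field names and the split procedure actually used may differ.

```python
import random


def train_validation_split(records, validation_fraction=0.1, seed=42):
    """Shuffle records deterministically and hold out a fraction for validation."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


# Hypothetical chat-format records with placeholder contents.
example_records = [
    {
        "messages": [
            {"role": "user", "content": f"Transliterate this Sanskrit text to IAST: sample {i}"},
            {"role": "assistant", "content": f"sample {i}"},
        ]
    }
    for i in range(20)
]

train_records, validation_records = train_validation_split(example_records)
print(len(train_records), len(validation_records))
```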
Training Configuration
```yaml
# Key training parameters
base_model: Qwen/Qwen2.5-7B-Instruct
adapter: lora
lora_r: 16
lora_alpha: 32
sequence_len: 512
num_epochs: 3
learning_rate: 0.00002
optimizer: adamw_8bit
lr_scheduler: cosine
bf16: auto
flash_attention: true
gradient_checkpointing: true
```
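For intuition about the `cosine` scheduler with a peak learning rate of 2e-5, the following sketch computes the learning rate at a given step. It assumes no warmup and decay to zero, neither of which is stated in the configuration above; the total step count of 1000 is a placeholder, and `cosine_lr` is an illustrative name rather than a function from the training framework.

```python
import math


def cosine_lr(step, total_steps, peak_lr=2e-5, min_lr=0.0):
    """Cosine decay from peak_lr to min_lr over total_steps (no warmup assumed)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Placeholder: the real number of optimizer steps depends on dataset size.
total_steps = 1000
for step in (0, 250, 500, 750, 1000):
    print(f"step {step:4d}: lr = {cosine_lr(step, total_steps):.2e}")
```

The rate starts at the peak, passes through half the peak at the midpoint, and reaches the minimum at the final step.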
Hardware Requirements
- Training: Multi-GPU setup with 24GB+ VRAM per GPU
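- Inference: 8GB+ VRAM, which in practice implies quantized weights (e.g. 4-bit); unquantized bfloat16 weights for a model of this size take roughly 15 GB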
- CPU: Compatible with CPU inference (slower)
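A rough way to sanity-check inference memory is to multiply the parameter count by the bytes stored per parameter. The sketch below assumes about 7.6 billion parameters for a Qwen2.5-7B-class model and counts weights only; the KV cache, activations, and framework overhead add to these figures. The function name `weight_memory_gb` is illustrative.

```python
# Assumption: roughly 7.6 billion parameters for a Qwen2.5-7B-class model.
NUM_PARAMETERS = 7.6e9

# Bytes stored per parameter at each precision.
BYTES_PER_PARAMETER = {"bfloat16": 2.0, "int8": 1.0, "4-bit": 0.5}


def weight_memory_gb(num_parameters, bytes_per_parameter):
    """Estimate memory for model weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


for precision, nbytes in BYTES_PER_PARAMETER.items():
    print(f"{precision}: ~{weight_memory_gb(NUM_PARAMETERS, nbytes):.1f} GB")
```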
🔄 Comparison with Previous Model
| Feature | Previous Model | Current Model |
|---|---|---|
| Base Model | Qwen2.5-7B-Instruct-1M | Qwen2.5-7B-Instruct |
| Dataset | Sanskrit-llama (Alpaca) | Sanskrit-transliteration-chat-dataset |
| Format | Alpaca format | Chat template format |
| Capabilities | Basic translation | Multi-task (transliteration + translation) |
| LoRA Rank | 32 | 16 (optimized) |
| Sequence Length | 1024 | 512 (focused) |
| Training Epochs | 1 | 3 (more thorough) |
| Specialization | General Sanskrit | Specialized for transliteration |
🛡️ Ethical Considerations
- Cultural Sensitivity: Respect for Sanskrit's cultural and religious significance
- Accuracy Disclaimer: Model outputs should be verified for important translations
- Educational Use: Primarily intended for educational and research purposes
- Bias Awareness: May reflect biases present in training data
📚 Citation
If you use this model in your research, please cite:
```bibtex
@misc{sanskrit-qwen-chat-lora,
  title={Sanskrit-qwen-7B-Translate-v2: A Specialized Sanskrit Translation and Transliteration Model},
  author={Divax Shah (diabolic6045)},
  year={2024},
  url={https://huggingface.co/diabolic6045/Sanskrit-qwen-7B-Translate-v2}
}
```
🤝 Contributing
We welcome contributions to improve this model:
- Dataset Contributions: High-quality Sanskrit translation pairs
- Evaluation: Benchmarking and performance analysis
- Bug Reports: Issues and improvement suggestions
- Documentation: Usage examples and tutorials
📄 License
This model is released under the Apache 2.0 License. See the LICENSE file for details.
🙏 Acknowledgments
- Qwen Team: For the excellent base model
- Axolotl Framework: For the training infrastructure
- Sanskrit Community: For linguistic guidance and feedback
- Open Source Community: For tools and resources
