---
library_name: transformers
license: apache-2.0
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- sanskrit
- translation
- transliteration
- qwen
- axolotl
- iast
- devanagari
- bilingual
datasets:
- diabolic6045/Sanskrit-transliteration-chat-dataset
model-index:
- name: Sanskrit-qwen-7B-Translate-v2
  results: []
---

# Sanskrit-qwen-7B-Translate-v2

![Sanskrit Model](https://img.shields.io/badge/Sanskrit-Translation-blue) ![License](https://img.shields.io/badge/License-Apache%202.0-brightgreen)

**A specialized Sanskrit language model for translation and transliteration tasks**

## 🌟 Model Description

This is a fine-tuned version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) optimized for Sanskrit language processing. It was trained with LoRA (Low-Rank Adaptation) on the `diabolic6045/Sanskrit-transliteration-chat-dataset` for three tasks:

1. **Sanskrit to IAST Transliteration** - Converting Devanagari script to IAST format
2. **Sanskrit to English Translation** - Translating Sanskrit text to English
3. **English to Sanskrit Translation** - Translating English text to Sanskrit

## 🚀 Key Features

### ✨ **Multi-Task Sanskrit Processing**

- **IAST Transliteration**: Conversion from Devanagari to IAST
- **Bidirectional Translation**: Sanskrit ↔ English translation
- **Context-Aware**: Aims to preserve meaning and cultural context
- **Chat-Optimized**: Uses a conversation format for natural interactions

### 🔧 **Technical Improvements Over Previous Model**

- **Changed Base Model**: Switched from Qwen2.5-7B-Instruct-1M to Qwen2.5-7B-Instruct
- **Specialized Dataset**: Trained on `Sanskrit-transliteration-chat-dataset` (vs. previous `Sanskrit-llama`)
- **Chat Template Format**: Uses a structured conversation format instead of the Alpaca format
- **Optimized LoRA**: Revised LoRA configuration and target modules
- **Memory Efficient**: Trained with flash attention and gradient checkpointing

## 📊 Model Specifications

| Parameter | Value |
|-----------|-------|
| **Base Model** | Qwen/Qwen2.5-7B-Instruct |
| **Fine-tuning Method** | LoRA (Low-Rank Adaptation) |
| **LoRA Rank** | 16 |
| **LoRA Alpha** | 32 |
| **Sequence Length** | 512 tokens |
| **Training Epochs** | 3 |
| **Learning Rate** | 2e-05 |
| **Batch Size** | 2 (micro) × 4 (gradient accumulation) = 8 per device |
| **Optimizer** | AdamW 8-bit |
| **Precision** | bfloat16 |

## 🎯 Intended Uses

### ✅ **Recommended Use Cases**

- **Academic Research**: Sanskrit text analysis and translation
- **Educational Tools**: Learning Sanskrit through translation
- **Cultural Preservation**: Digitizing Sanskrit manuscripts
- **Linguistic Studies**: Comparative language analysis
- **Content Creation**: Sanskrit-English bilingual content

### ⚠️ **Limitations**

- **Experimental Model**: Still in development; results may vary
- **Context Sensitivity**: Performance depends on text complexity
- **Domain Specific**: Optimized for classical Sanskrit texts
- **Verification Required**: Important translations should be cross-checked

## 🛠️ Usage Examples

### 1. Sanskrit to IAST Transliteration

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "diabolic6045/Sanskrit-qwen-7B-Translate-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# torch_dtype="auto" loads the checkpoint's stored precision instead of float32;
# device_map="auto" places the model on available GPUs (requires `accelerate`).
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

# Prepare the conversation
messages = [
    {
        "role": "system",
        "content": "You are a Sanskrit transliteration expert. Convert the given Sanskrit text from Devanagari script to IAST (International Alphabet of Sanskrit Transliteration) format."
    },
    {
        "role": "user",
        "content": "Transliterate this Sanskrit text to IAST: बुद्धिश्चार्थात्परो लोभः सन्तोषः परमं सुखम् ।"
    }
]

# Apply the chat template and move the inputs to the model's device
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
# Example output (sampling is enabled, so results may vary):
# buddhiścārthātparo lobhaḥ santoṣaḥ paramaṃ sukham |
```

### 2. Sanskrit to English Translation

```python
# Reuses `model` and `tokenizer` loaded in example 1
messages = [
    {
        "role": "system",
        "content": "You are a Sanskrit to English translation expert. Translate the given Sanskrit text accurately while preserving the meaning and context."
    },
    {
        "role": "user",
        "content": "Translate this Sanskrit text to English: यद॒ग्नौ सूर्ये॑ वि॒षं पृ॑थि॒व्यामोष॑धीषु॒ यत् ।"
    }
]

# Generate translation
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
# Example output (results may vary):
# The poison that is in the sun, in the earth and in the herbs...
```

### 3. English to Sanskrit Translation

```python
# Reuses `model` and `tokenizer` loaded in example 1
messages = [
    {
        "role": "system",
        "content": "You are an English to Sanskrit translation expert. Translate the given English text accurately into Sanskrit while preserving the meaning and context."
    },
    {
        "role": "user",
        "content": "Translate this English text to Sanskrit: May the divine powers protect us and grant us wisdom."
    }
]

# Generate Sanskrit translation
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
# Example output (results may vary):
# देवाः अस्मान् रक्षन्तु बुद्धिं च प्रयच्छन्तु ।
```

## 🎮 Interactive Demo

Try the model with our Gradio interface:

### Run the interactive [demo](https://huggingface.co/spaces/diabolic6045/Sanskrit-qwen-7B-Translate-v2)

The demo provides:

- **Mode Selection**: Choose between transliteration and translation modes
- **Real-time Processing**: Instant results with adjustable parameters
- **Example Library**: Pre-loaded examples for each mode
- **Parameter Tuning**: Adjust temperature and max length

## 📈 Training Details

### Dataset Information

- **Source**: `diabolic6045/Sanskrit-transliteration-chat-dataset`
- **Format**: Chat template with structured conversations
- **Size**: Comprehensive Sanskrit corpus with multiple translation pairs
- **Validation Split**: 10% for evaluation

### Training Configuration

```yaml
# Key training parameters
base_model: Qwen/Qwen2.5-7B-Instruct
adapter: lora
lora_r: 16
lora_alpha: 32
sequence_len: 512
num_epochs: 3
learning_rate: 0.00002
optimizer: adamw_8bit
lr_scheduler: cosine
bf16: auto
flash_attention: true
gradient_checkpointing: true
```

### Hardware Requirements

- **Training**: Multi-GPU setup with 24GB+ VRAM per GPU
- **Inference**: 8GB+ VRAM is realistic only with quantized weights (e.g. 4-bit); the unquantized bfloat16 weights of a 7B model alone take roughly 15 GB
- **CPU**: Compatible with CPU inference (slower)

## 🔄 Comparison with Previous Model

| Feature | Previous Model | Current Model |
|---------|---------------|---------------|
| **Base Model** | Qwen2.5-7B-Instruct-1M | Qwen2.5-7B-Instruct |
| **Dataset** | Sanskrit-llama (Alpaca) | Sanskrit-transliteration-chat-dataset |
| **Format** | Alpaca format | Chat template format |
| **Capabilities** | Basic translation | Multi-task (transliteration + translation) |
| **LoRA Rank** | 32 | 16 (optimized) |
| **Sequence Length** | 1024 | 512 (focused) |
| **Training Epochs** | 1 | 3 (more thorough) |
| **Specialization** | General Sanskrit | Specialized for transliteration |

## 🛡️ Ethical Considerations

- **Cultural Sensitivity**: Respect for Sanskrit's cultural and religious significance
- **Accuracy Disclaimer**: Model outputs should be verified for important translations
- **Educational Use**: Primarily intended for educational and research purposes
- **Bias Awareness**: May reflect biases present in training data

## 📚 Citation

If you use this model in your research, please cite:

```bibtex
@misc{sanskrit-qwen-chat-lora,
  title={Sanskrit-qwen-7B-Translate-v2: A Specialized Sanskrit Translation and Transliteration Model},
  author={Divax Shah (diabolic6045)},
  year={2024},
  url={https://huggingface.co/diabolic6045/Sanskrit-qwen-7B-Translate-v2}
}
```

## 🤝 Contributing

We welcome contributions to improve this model:

1. **Dataset Contributions**: High-quality Sanskrit translation pairs
2. **Evaluation**: Benchmarking and performance analysis
3. **Bug Reports**: Issues and improvement suggestions
4. **Documentation**: Usage examples and tutorials

## 📄 License

This model is released under the Apache 2.0 License. See the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- **Qwen Team**: For the excellent base model
- **Axolotl Framework**: For the training infrastructure
- **Sanskrit Community**: For linguistic guidance and feedback
- **Open Source Community**: For tools and resources

---
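
## 🧩 Appendix: Prompt Helper Sketch

The three usage examples in this card differ only in their system prompt and in the prefix of the user message. As a minimal sketch, the helper below builds the `messages` list for each task from the prompts shown in those examples. The names `SYSTEM_PROMPTS`, `USER_PREFIXES`, `build_messages`, and the task keys `"iast"`, `"sa2en"`, `"en2sa"` are hypothetical and are not part of the model, the dataset, or `transformers`.

```python
# Hypothetical helper: the names and task keys here are illustrative only.
# The prompt strings are copied from the usage examples in this card.
SYSTEM_PROMPTS = {
    "iast": (
        "You are a Sanskrit transliteration expert. Convert the given Sanskrit "
        "text from Devanagari script to IAST (International Alphabet of "
        "Sanskrit Transliteration) format."
    ),
    "sa2en": (
        "You are a Sanskrit to English translation expert. Translate the given "
        "Sanskrit text accurately while preserving the meaning and context."
    ),
    "en2sa": (
        "You are an English to Sanskrit translation expert. Translate the given "
        "English text accurately into Sanskrit while preserving the meaning "
        "and context."
    ),
}

USER_PREFIXES = {
    "iast": "Transliterate this Sanskrit text to IAST: ",
    "sa2en": "Translate this Sanskrit text to English: ",
    "en2sa": "Translate this English text to Sanskrit: ",
}


def build_messages(task: str, text: str) -> list[dict[str, str]]:
    """Return a system + user message pair for one of the three tasks."""
    if task not in SYSTEM_PROMPTS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(SYSTEM_PROMPTS)}")
    return [
        {"role": "system", "content": SYSTEM_PROMPTS[task]},
        {"role": "user", "content": USER_PREFIXES[task] + text.strip()},
    ]


if __name__ == "__main__":
    # Build the conversation for a transliteration request and show the user turn
    messages = build_messages("iast", "सन्तोषः परमं सुखम् ।")
    print(messages[1]["content"])
```

The returned list can be passed to `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` in the same way as the hand-written `messages` in the usage examples.
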

**Built with ❤️ for Sanskrit language preservation and education**

[Built with Axolotl](https://github.com/axolotl-ai-cloud/axolotl)