ModelHub XC 4cf0552372 Initialize project; model provided by the ModelHub XC community
Model: diabolic6045/Sanskrit-qwen-7B-Translate-v2
Source: Original Platform
2026-06-06 23:44:22 +08:00

---
library_name: transformers
license: apache-2.0
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- sanskrit
- translation
- transliteration
- qwen
- axolotl
- iast
- devanagari
- bilingual
datasets:
- diabolic6045/Sanskrit-transliteration-chat-dataset
model-index:
- name: Sanskrit-qwen-7B-Translate-v2
results: []
---
# Sanskrit-qwen-7B-Translate-v2
<div align="center">
<img src="https://huggingface.co/diabolic6045/Sanskrit-qwen-7B-Translate-v2/resolve/main/images/poster.png" alt="Sanskrit AI Poster" width="600" style="margin-bottom: 20px;">
![Sanskrit Model](https://img.shields.io/badge/Sanskrit-Translation-blue)
![License](https://img.shields.io/badge/License-Apache%202.0-brightgreen)
**A specialized Sanskrit language model for translation and transliteration tasks**
</div>
## 🌟 Model Description
This model is a fine-tune of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) for Sanskrit language processing. It was trained with LoRA (Low-Rank Adaptation) on the `diabolic6045/Sanskrit-transliteration-chat-dataset` dataset and targets three tasks:
1. **Sanskrit to IAST Transliteration** - Converting Devanagari script to IAST format
2. **Sanskrit to English Translation** - Translating Sanskrit text to English
3. **English to Sanskrit Translation** - Translating English text to Sanskrit
## 🚀 Key Features
### ✨ **Multi-Task Sanskrit Processing**
- **IAST Transliteration**: Accurate conversion from Devanagari to IAST
- **Bidirectional Translation**: Sanskrit ↔ English translation
- **Context-Aware**: Aims to preserve meaning and cultural context
- **Chat-Optimized**: Uses conversation format for natural interactions
### 🔧 **Technical Improvements Over Previous Model**
- **Different Base Model**: Switched from Qwen2.5-7B-Instruct-1M to Qwen2.5-7B-Instruct
- **Specialized Dataset**: Trained on `Sanskrit-transliteration-chat-dataset` (vs. previous `Sanskrit-llama`)
- **Chat Template Format**: Uses structured conversation format for better performance
- **Optimized LoRA**: Improved LoRA configuration with better target modules
- **Memory Efficient**: Trained with flash attention and gradient checkpointing
## 📊 Model Specifications
| Parameter | Value |
|-----------|-------|
| **Base Model** | Qwen/Qwen2.5-7B-Instruct |
| **Fine-tuning Method** | LoRA (Low-Rank Adaptation) |
| **LoRA Rank** | 16 |
| **LoRA Alpha** | 32 |
| **Sequence Length** | 512 tokens |
| **Training Epochs** | 3 |
| **Learning Rate** | 2e-05 |
| **Batch Size** | 2 (micro) × 4 (gradient accumulation) = 8 per device |
| **Optimizer** | AdamW 8-bit |
| **Precision** | bfloat16 |
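Two derived quantities follow from the table: the effective batch size per device and the LoRA scaling factor. The sketch below works through that arithmetic, assuming the standard LoRA convention of scaling the update by `alpha / r`. The layer width of 3584 is an illustrative assumption (believed to be the hidden size of Qwen2.5-7B); the actual LoRA target modules are not listed in this card.

```python
# Values taken from the specification table above.
micro_batch_size = 2
gradient_accumulation_steps = 4
lora_r = 16
lora_alpha = 32

# Sequences contributing to one optimizer step on a single device.
# The number of GPUs is not stated, so the global batch size is left out.
effective_batch_per_device = micro_batch_size * gradient_accumulation_steps

# Standard LoRA scales the low-rank update by alpha / r.
lora_scaling = lora_alpha / lora_r


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_in x d_out linear layer."""
    # Matrix A has shape (rank, d_in); matrix B has shape (d_out, rank).
    return rank * (d_in + d_out)


# ASSUMPTION: 3584 is an illustrative layer width, not a value from this card.
example_params = lora_param_count(3584, 3584, lora_r)

print(effective_batch_per_device)  # 8
print(lora_scaling)                # 2.0
print(example_params)              # 114688
```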
## 🎯 Intended Uses
### ✅ **Recommended Use Cases**
- **Academic Research**: Sanskrit text analysis and translation
- **Educational Tools**: Learning Sanskrit through translation
- **Cultural Preservation**: Digitizing Sanskrit manuscripts
- **Linguistic Studies**: Comparative language analysis
- **Content Creation**: Sanskrit-English bilingual content
### ⚠️ **Limitations**
- **Experimental Model**: Still in development, results may vary
- **Context Sensitivity**: Performance depends on text complexity
- **Domain Specific**: Optimized for classical Sanskrit texts
- **Verification Required**: Important translations should be cross-checked
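Because outputs need cross-checking, a cheap automated check can catch the most obvious transliteration failures before a human review. The sketch below is a heuristic only, not part of the model or its tooling: the function names are hypothetical, and it merely tests whether any characters from the Devanagari Unicode block remain in the output.

```python
def devanagari_chars(text: str) -> list[str]:
    """Return the characters that fall in the Devanagari block (U+0900-U+097F)."""
    return [ch for ch in text if "\u0900" <= ch <= "\u097f"]


def looks_like_iast(text: str) -> bool:
    """Heuristic: non-empty output with no Devanagari characters left in it.

    Note that the danda (U+0964) is inside the Devanagari block, so output
    that keeps it instead of "|" is flagged as well.
    """
    return bool(text.strip()) and not devanagari_chars(text)


print(looks_like_iast("buddhiścārthātparo lobhaḥ santoṣaḥ paramaṃ sukham |"))  # True
print(looks_like_iast("सन्तोषः परमं सुखम् ।"))  # False
```

Passing this check says nothing about whether the transliteration is correct; it only rules out output that is still partly in Devanagari.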
## 🛠️ Usage Examples
### 1. Sanskrit to IAST Transliteration
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "diabolic6045/Sanskrit-qwen-7B-Translate-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Prepare the conversation: a system prompt naming the task, then the user request
messages = [
    {
        "role": "system",
        "content": "You are a Sanskrit transliteration expert. Convert the given Sanskrit text from Devanagari script to IAST (International Alphabet of Sanskrit Transliteration) format.",
    },
    {
        "role": "user",
        "content": "Transliterate this Sanskrit text to IAST: बुद्धिश्चार्थात्परो लोभः सन्तोषः परमं सुखम् ।",
    },
]

# Apply the chat template, then move the inputs to the model's device
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
# Example output (sampling can vary between runs):
# buddhiścārthātparo lobhaḥ santoṣaḥ paramaṃ sukham |
```
### 2. Sanskrit to English Translation
```python
# Reuses `tokenizer` and `model` loaded in example 1
messages = [
    {
        "role": "system",
        "content": "You are a Sanskrit to English translation expert. Translate the given Sanskrit text accurately while preserving the meaning and context.",
    },
    {
        "role": "user",
        "content": "Translate this Sanskrit text to English: यद॒ग्नौ सूर्ये॑ वि॒षं पृ॑थि॒व्यामोष॑धीषु॒ यत् ।",
    },
]

# Generate translation
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
# Example output (sampling can vary between runs):
# The poison that is in the sun, in the earth and in the herbs...
```
### 3. English to Sanskrit Translation
```python
# Reuses `tokenizer` and `model` loaded in example 1
messages = [
    {
        "role": "system",
        "content": "You are an English to Sanskrit translation expert. Translate the given English text accurately into Sanskrit while preserving the meaning and context.",
    },
    {
        "role": "user",
        "content": "Translate this English text to Sanskrit: May the divine powers protect us and grant us wisdom.",
    },
]

# Generate Sanskrit translation
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
# Example output (sampling can vary between runs):
# देवाः अस्मान् रक्षन्तु बुद्धिं च प्रयच्छन्तु ।
```
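The three examples differ only in the system prompt and the prefix of the user message, so the conversation can be built by one small helper. This is a convenience sketch, not part of the released code: the task keys (`transliterate`, `sa_to_en`, `en_to_sa`) and the function name are hypothetical, while the prompt strings are copied from the examples above.

```python
SYSTEM_PROMPTS = {
    "transliterate": (
        "You are a Sanskrit transliteration expert. Convert the given Sanskrit text "
        "from Devanagari script to IAST (International Alphabet of Sanskrit "
        "Transliteration) format."
    ),
    "sa_to_en": (
        "You are a Sanskrit to English translation expert. Translate the given "
        "Sanskrit text accurately while preserving the meaning and context."
    ),
    "en_to_sa": (
        "You are an English to Sanskrit translation expert. Translate the given "
        "English text accurately into Sanskrit while preserving the meaning and context."
    ),
}

USER_PREFIXES = {
    "transliterate": "Transliterate this Sanskrit text to IAST: ",
    "sa_to_en": "Translate this Sanskrit text to English: ",
    "en_to_sa": "Translate this English text to Sanskrit: ",
}


def build_messages(task: str, text: str) -> list[dict[str, str]]:
    """Build the system + user conversation for one of the three tasks."""
    if task not in SYSTEM_PROMPTS:
        raise ValueError(f"Unknown task {task!r}; expected one of {sorted(SYSTEM_PROMPTS)}")
    return [
        {"role": "system", "content": SYSTEM_PROMPTS[task]},
        {"role": "user", "content": USER_PREFIXES[task] + text},
    ]


messages = build_messages("sa_to_en", "सन्तोषः परमं सुखम् ।")
print(messages[1]["content"])
```

The returned list can be passed to `tokenizer.apply_chat_template` exactly as in the examples above.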
## 🎮 Interactive Demo
Try the model with our Gradio interface:
### Run the interactive [demo](https://huggingface.co/spaces/diabolic6045/Sanskrit-qwen-7B-Translate-v2)
The demo provides:
- **Mode Selection**: Choose between transliteration and translation modes
- **Real-time Processing**: Instant results with adjustable parameters
- **Example Library**: Pre-loaded examples for each mode
- **Parameter Tuning**: Adjust temperature and max length
## 📈 Training Details
### Dataset Information
- **Source**: `diabolic6045/Sanskrit-transliteration-chat-dataset`
- **Format**: Chat template with structured conversations
- **Size**: Comprehensive Sanskrit corpus with multiple translation pairs
- **Validation Split**: 10% for evaluation
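A 10% hold-out like the one described can be reproduced with a seeded shuffle. The sketch below is illustrative only: the function name, the seed, and the placeholder records are assumptions, and the card does not say how the actual split was drawn.

```python
import random


def split_train_val(examples, val_fraction=0.1, seed=42):
    """Shuffle deterministically and hold out `val_fraction` of the examples."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Placeholder records standing in for chat-formatted conversations.
records = [f"conversation-{i}" for i in range(100)]
train, val = split_train_val(records)
print(len(train), len(val))  # 90 10
```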
### Training Configuration
```yaml
# Key training parameters
base_model: Qwen/Qwen2.5-7B-Instruct
adapter: lora
lora_r: 16
lora_alpha: 32
sequence_len: 512
num_epochs: 3
learning_rate: 0.00002
optimizer: adamw_8bit
lr_scheduler: cosine
bf16: auto
flash_attention: true
gradient_checkpointing: true
```
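The `lr_scheduler: cosine` setting means the learning rate decays from its peak along half a cosine wave. The sketch below shows that shape in plain Python. The warmup length is not given in the excerpt above, so `warmup_steps=0` is an assumption, and `total_steps=1000` is a made-up value for illustration.

```python
import math


def cosine_lr(step: int, total_steps: int, peak_lr: float = 2e-5, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: optional linear warmup, then half-cosine decay to zero."""
    if warmup_steps and step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


# ASSUMPTION: 1000 total optimizer steps, chosen only for illustration.
for step in (0, 250, 500, 750, 1000):
    print(step, f"{cosine_lr(step, total_steps=1000):.2e}")
```

With these settings the rate starts at 2e-05, passes through half of that at the midpoint, and reaches zero on the final step.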
### Hardware Requirements
- **Training**: Multi-GPU setup with 24GB+ VRAM per GPU
- **Inference**: 8GB+ VRAM with quantized weights (unquantized bfloat16 weights for a 7B-class model need about 15 GB)
- **CPU**: Compatible with CPU inference (slower)
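A rough way to size inference hardware is to multiply the parameter count by the bytes stored per parameter. The sketch below does that for a few precisions. The parameter count of about 7.6 billion is an assumption for a Qwen2.5-7B-class model, and the estimate covers weights only: activations and the KV cache add to it.

```python
def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


# ASSUMPTION: approximate parameter count, not a figure from this card.
NUM_PARAMS = 7.6e9

for label, bytes_per_param in [("bfloat16", 2), ("int8", 1), ("4-bit", 0.5)]:
    print(f"{label}: {weight_memory_gib(NUM_PARAMS, bytes_per_param):.1f} GiB")
```

Under this assumption the weights take about 14 GiB in bfloat16 and about 3.5 GiB at 4 bits, which is why an 8 GB GPU implies quantized loading.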
## 🔄 Comparison with Previous Model
| Feature | Previous Model | Current Model |
|---------|---------------|---------------|
| **Base Model** | Qwen2.5-7B-Instruct-1M | Qwen2.5-7B-Instruct |
| **Dataset** | Sanskrit-llama (Alpaca) | Sanskrit-transliteration-chat-dataset |
| **Format** | Alpaca format | Chat template format |
| **Capabilities** | Basic translation | Multi-task (transliteration + translation) |
| **LoRA Rank** | 32 | 16 (optimized) |
| **Sequence Length** | 1024 | 512 (focused) |
| **Training Epochs** | 1 | 3 (more thorough) |
| **Specialization** | General Sanskrit | Specialized for transliteration and translation |
## 🛡️ Ethical Considerations
- **Cultural Sensitivity**: Respect for Sanskrit's cultural and religious significance
- **Accuracy Disclaimer**: Model outputs should be verified for important translations
- **Educational Use**: Primarily intended for educational and research purposes
- **Bias Awareness**: May reflect biases present in training data
## 📚 Citation
If you use this model in your research, please cite:
```bibtex
@misc{sanskrit-qwen-chat-lora,
title={Sanskrit-qwen-7B-Translate-v2: A Specialized Sanskrit Translation and Transliteration Model},
author={Divax Shah (diabolic6045)},
year={2024},
url={https://huggingface.co/diabolic6045/Sanskrit-qwen-7B-Translate-v2}
}
```
## 🤝 Contributing
We welcome contributions to improve this model:
1. **Dataset Contributions**: High-quality Sanskrit translation pairs
2. **Evaluation**: Benchmarking and performance analysis
3. **Bug Reports**: Issues and improvement suggestions
4. **Documentation**: Usage examples and tutorials
## 📄 License
This model is released under the Apache 2.0 License. See the [LICENSE](LICENSE) file for details.
## 🙏 Acknowledgments
- **Qwen Team**: For the excellent base model
- **Axolotl Framework**: For the training infrastructure
- **Sanskrit Community**: For linguistic guidance and feedback
- **Open Source Community**: For tools and resources
---
<div align="center">
**Built with ❤️ for Sanskrit language preservation and education**
[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
</div>