ModelHub XC 4cf0552372 Initialize project; model provided by the ModelHub XC community
Model: diabolic6045/Sanskrit-qwen-7B-Translate-v2
Source: Original Platform
2026-06-06 23:44:22 +08:00

library_name: transformers
license: apache-2.0
base_model: Qwen/Qwen2.5-7B-Instruct
tags: sanskrit, translation, transliteration, qwen, axolotl, iast, devanagari, bilingual
datasets: diabolic6045/Sanskrit-transliteration-chat-dataset
model-index: name: Sanskrit-qwen-7B-Translate-v2; results: (none listed)

Sanskrit-qwen-7B-Translate-v2

A specialized Sanskrit language model for translation and transliteration tasks

🌟 Model Description

This model is a LoRA (Low-Rank Adaptation) fine-tune of Qwen/Qwen2.5-7B-Instruct for Sanskrit language processing. It was trained on the diabolic6045/Sanskrit-transliteration-chat-dataset to handle three tasks:

  1. Sanskrit to IAST Transliteration - Converting Devanagari script to IAST format
  2. Sanskrit to English Translation - Translating Sanskrit text to English
  3. English to Sanskrit Translation - Translating English text to Sanskrit

🚀 Key Features

Multi-Task Sanskrit Processing

  • IAST Transliteration: Accurate conversion from Devanagari to IAST
  • Bidirectional Translation: Sanskrit ↔ English translation
  • Context-Aware: Preserves meaning and cultural context
  • Chat-Optimized: Uses conversation format for natural interactions

🔧 Technical Improvements Over Previous Model

  • Enhanced Base Model: Upgraded from Qwen2.5-7B-Instruct-1M to Qwen2.5-7B-Instruct
  • Specialized Dataset: Trained on Sanskrit-transliteration-chat-dataset (vs. previous Sanskrit-llama)
  • Chat Template Format: Uses structured conversation format for better performance
  • Optimized LoRA: Improved LoRA configuration with better target modules
  • Memory Efficient: Enhanced with flash attention and gradient checkpointing

📊 Model Specifications

| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-7B-Instruct |
| Fine-tuning Method | LoRA (Low-Rank Adaptation) |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| Sequence Length | 512 tokens |
| Training Epochs | 3 |
| Learning Rate | 2e-05 |
| Batch Size | 2 (micro) × 4 (gradient accumulation) |
| Optimizer | AdamW 8-bit |
| Precision | bfloat16 |
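
Two derived quantities follow directly from these values: the effective batch size per device and the LoRA scaling factor. The sketch below computes both. The 3584 × 3584 layer used for the parameter count is a hypothetical size chosen only for illustration; the source does not state which modules were adapted or how many GPUs were used.

```python
# Values taken from the specification table above.
micro_batch_size = 2
gradient_accumulation_steps = 4
lora_rank = 16
lora_alpha = 32

# Sequences contributing to each optimizer step, per device.
effective_batch_size = micro_batch_size * gradient_accumulation_steps

# Standard LoRA scales the low-rank update by alpha / r.
lora_scaling = lora_alpha / lora_rank


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # A has shape (rank, d_in) and B has shape (d_out, rank).
    return rank * (d_in + d_out)


# Hypothetical layer size, chosen only for illustration.
example_layer_params = lora_params(3584, 3584, lora_rank)

print(effective_batch_size)   # 8
print(lora_scaling)           # 2.0
print(example_layer_params)   # 114688
```

On a multi-GPU run the global batch size would be this per-device figure multiplied by the number of GPUs.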

🎯 Intended Uses

  • Academic Research: Sanskrit text analysis and translation
  • Educational Tools: Learning Sanskrit through translation
  • Cultural Preservation: Digitizing Sanskrit manuscripts
  • Linguistic Studies: Comparative language analysis
  • Content Creation: Sanskrit-English bilingual content

⚠️ Limitations

  • Experimental Model: Still in development, results may vary
  • Context Sensitivity: Performance depends on text complexity
  • Domain Specific: Optimized for classical Sanskrit texts
  • Verification Required: Important translations should be cross-checked
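
For transliteration, part of the cross-checking can be automated with a cheap sanity check: a valid IAST string should contain no Devanagari characters. The sketch below is a heuristic only, and the function names are hypothetical. It catches untransliterated leftovers but says nothing about whether the romanization itself is correct.

```python
def contains_devanagari(text: str) -> bool:
    """Return True if any character falls in the Devanagari Unicode block."""
    return any("\u0900" <= ch <= "\u097f" for ch in text)


def looks_like_iast(text: str) -> bool:
    """Heuristic: non-empty output with no Devanagari characters left in it."""
    return bool(text.strip()) and not contains_devanagari(text)


source = "बुद्धिश्चार्थात्परो लोभः सन्तोषः परमं सुखम् ।"
output = "buddhiścārthātparo lobhaḥ santoṣaḥ paramaṃ sukham |"

print(contains_devanagari(source))  # True
print(looks_like_iast(output))      # True
```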

🛠️ Usage Examples

1. Sanskrit to IAST Transliteration

from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "diabolic6045/Sanskrit-qwen-7B-Translate-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Prepare the conversation
messages = [
    {
        "role": "system", 
        "content": "You are a Sanskrit transliteration expert. Convert the given Sanskrit text from Devanagari script to IAST (International Alphabet of Sanskrit Transliteration) format."
    },
    {
        "role": "user", 
        "content": "Transliterate this Sanskrit text to IAST: बुद्धिश्चार्थात्परो लोभः सन्तोषः परमं सुखम् ।"
    }
]

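# Apply the chat template, then sample a response
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# Pass do_sample=True explicitly so temperature applies regardless of the default generation config
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)

print(response)
# Example output (sampling makes results vary): buddhiścārthātparo lobhaḥ santoṣaḥ paramaṃ sukham |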

2. Sanskrit to English Translation

messages = [
    {
        "role": "system", 
        "content": "You are a Sanskrit to English translation expert. Translate the given Sanskrit text accurately while preserving the meaning and context."
    },
    {
        "role": "user", 
        "content": "Translate this Sanskrit text to English: यद॒ग्नौ सूर्ये॑ वि॒षं पृ॑थि॒व्यामोष॑धीषु॒ यत् ।"
    }
]

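# Generate translation (reuses `tokenizer` and `model` loaded in example 1)
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# Pass do_sample=True explicitly so temperature applies regardless of the default generation config
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)

print(response)
# Example output (sampling makes results vary): The poison that is in the sun, in the earth and in the herbs...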

3. English to Sanskrit Translation

messages = [
    {
        "role": "system", 
        "content": "You are an English to Sanskrit translation expert. Translate the given English text accurately into Sanskrit while preserving the meaning and context."
    },
    {
        "role": "user", 
        "content": "Translate this English text to Sanskrit: May the divine powers protect us and grant us wisdom."
    }
]

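# Generate Sanskrit translation (reuses `tokenizer` and `model` loaded in example 1)
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# Pass do_sample=True explicitly so temperature applies regardless of the default generation config
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)

print(response)
# Example output (sampling makes results vary): देवाः अस्मान् रक्षन्तु बुद्धिं च प्रयच्छन्तु ।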
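
The three examples differ only in their system prompt and in the prefix placed before the user's text, so the message construction can be factored into one helper. This is a minimal sketch: the task keys (`transliterate`, `sa_to_en`, `en_to_sa`) and the function name are hypothetical, while the prompt strings are copied from the examples above.

```python
# System prompts and user prefixes copied from the three usage examples.
TASKS = {
    "transliterate": (
        "You are a Sanskrit transliteration expert. Convert the given Sanskrit "
        "text from Devanagari script to IAST (International Alphabet of "
        "Sanskrit Transliteration) format.",
        "Transliterate this Sanskrit text to IAST: ",
    ),
    "sa_to_en": (
        "You are a Sanskrit to English translation expert. Translate the given "
        "Sanskrit text accurately while preserving the meaning and context.",
        "Translate this Sanskrit text to English: ",
    ),
    "en_to_sa": (
        "You are an English to Sanskrit translation expert. Translate the given "
        "English text accurately into Sanskrit while preserving the meaning "
        "and context.",
        "Translate this English text to Sanskrit: ",
    ),
}


def build_messages(task: str, text: str) -> list[dict[str, str]]:
    """Build the system and user messages for one of the three tasks."""
    if task not in TASKS:
        raise ValueError(f"Unknown task {task!r}; expected one of {sorted(TASKS)}")
    system_prompt, prefix = TASKS[task]
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prefix + text},
    ]


messages = build_messages(
    "en_to_sa", "May the divine powers protect us and grant us wisdom."
)
print(messages[1]["content"])
```

The returned list can be passed to `tokenizer.apply_chat_template` exactly as in the examples above.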

🎮 Interactive Demo

Try the model with our Gradio interface:

Run the interactive demo

The demo provides:

  • Mode Selection: Choose between transliteration and translation modes
  • Real-time Processing: Instant results with adjustable parameters
  • Example Library: Pre-loaded examples for each mode
  • Parameter Tuning: Adjust temperature and max length

📈 Training Details

Dataset Information

  • Source: diabolic6045/Sanskrit-transliteration-chat-dataset
  • Format: Chat template with structured conversations
  • Size: Comprehensive Sanskrit corpus with multiple translation pairs
  • Validation Split: 10% for evaluation
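
The source does not say how the 10% validation split was drawn. As a rough sketch of one common approach, assuming a random shuffle with a fixed seed, a 90/10 split looks like this. The function name, the seed, and the placeholder records are all hypothetical.

```python
import random


def split_dataset(records, val_fraction=0.1, seed=42):
    """Shuffle a copy of the records and hold out a fraction for validation."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    # First n_val shuffled items become validation, the rest training.
    return shuffled[n_val:], shuffled[:n_val]


# Placeholder records standing in for chat-formatted examples.
records = [f"pair-{i}" for i in range(1000)]
train, val = split_dataset(records)

print(len(train), len(val))  # 900 100
```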

Training Configuration

# Key training parameters
base_model: Qwen/Qwen2.5-7B-Instruct
adapter: lora
lora_r: 16
lora_alpha: 32
sequence_len: 512
num_epochs: 3
learning_rate: 0.00002
optimizer: adamw_8bit
lr_scheduler: cosine
bf16: auto
flash_attention: true
gradient_checkpointing: true
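
To make the `lr_scheduler: cosine` setting concrete, the sketch below evaluates a plain cosine decay from the configured peak of 2e-5. It assumes no warmup and a decay to zero, neither of which the configuration shown here confirms, and the total step count of 300 is a made-up figure for illustration.

```python
import math


def cosine_lr(step: int, total_steps: int, peak_lr: float = 2e-5, min_lr: float = 0.0) -> float:
    """Cosine decay from peak_lr at step 0 to min_lr at total_steps (no warmup)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


total_steps = 300  # hypothetical; the real value depends on dataset size and batch size

for step in (0, 150, 300):
    print(step, f"{cosine_lr(step, total_steps):.2e}")
# 0 2.00e-05
# 150 1.00e-05
# 300 0.00e+00
```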

Hardware Requirements

  • Training: Multi-GPU setup with 24GB+ VRAM per GPU
  • Inference: 8GB+ VRAM, which in practice implies quantized weights (a 7B-parameter model in bfloat16 needs roughly 15GB for the weights alone)
  • CPU: Compatible with CPU inference (slower)
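
A quick way to judge whether a given GPU is large enough is a weights-only estimate: parameter count times bytes per parameter. The sketch below assumes roughly 7.6 billion parameters for the base model and ignores activations, the KV cache, and quantization overhead, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


NUM_PARAMS = 7.6e9  # assumption: approximate parameter count of the base model

for label, bits in [("bfloat16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gib(NUM_PARAMS, bits):.1f} GiB")
# bfloat16: 14.2 GiB
# int8: 7.1 GiB
# 4-bit: 3.5 GiB
```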

🔄 Comparison with Previous Model

| Feature | Previous Model | Current Model |
|---|---|---|
| Base Model | Qwen2.5-7B-Instruct-1M | Qwen2.5-7B-Instruct |
| Dataset | Sanskrit-llama (Alpaca) | Sanskrit-transliteration-chat-dataset |
| Format | Alpaca format | Chat template format |
| Capabilities | Basic translation | Multi-task (transliteration + translation) |
| LoRA Rank | 32 | 16 (optimized) |
| Sequence Length | 1024 | 512 (focused) |
| Training Epochs | 1 | 3 (more thorough) |
| Specialization | General Sanskrit | Specialized for transliteration |

🛡️ Ethical Considerations

  • Cultural Sensitivity: Respect for Sanskrit's cultural and religious significance
  • Accuracy Disclaimer: Model outputs should be verified for important translations
  • Educational Use: Primarily intended for educational and research purposes
  • Bias Awareness: May reflect biases present in training data

📚 Citation

If you use this model in your research, please cite:

@misc{sanskrit-qwen-chat-lora,
  title={Sanskrit-qwen-7B-Translate-v2: A Specialized Sanskrit Translation and Transliteration Model},
  author={Divax Shah (diabolic6045)},
  year={2024},
  url={https://huggingface.co/diabolic6045/Sanskrit-qwen-7B-Translate-v2}
}

🤝 Contributing

We welcome contributions to improve this model:

  1. Dataset Contributions: High-quality Sanskrit translation pairs
  2. Evaluation: Benchmarking and performance analysis
  3. Bug Reports: Issues and improvement suggestions
  4. Documentation: Usage examples and tutorials

📄 License

This model is released under the Apache 2.0 License. See the LICENSE file for details.

🙏 Acknowledgments

  • Qwen Team: For the excellent base model
  • Axolotl Framework: For the training infrastructure
  • Sanskrit Community: For linguistic guidance and feedback
  • Open Source Community: For tools and resources

Built with ❤️ for Sanskrit language preservation and education

Built with Axolotl