---
license: apache-2.0
datasets:
- Intel/orca_dpo_pairs
---

traversaal-2.5-Mistral-7B is trained via Direct Preference Optimization (DPO), using teknium/OpenHermes-2.5-Mistral-7B as its base model, with several hyperparameter optimizations.
teknium/OpenHermes-2.5-Mistral-7B is trained via Supervised Fine-Tuning (SFT) using LoRA, with mistralai/Mistral-7B-v0.1 as its base model.
Note that we did not use any form of weight merging.
For leaderboard submission, the trained weights are realigned for compatibility with Mistral-7B.
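
For readers unfamiliar with DPO, the per-example objective can be sketched as below. This is a minimal illustration of the standard DPO loss and not the training code used for this model. The function name `dpo_loss`, the example log-probabilities, and `beta=0.1` are assumptions made for illustration; the hyperparameters actually used are not stated in this README.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (prompt, chosen, rejected) preference pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    beta=0.1 is an illustrative default, not this model's setting.
    """
    # How much more the policy favors each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin between the chosen and rejected responses.
    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: margin is 0, so the loss is log(2).
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)

# Policy shifted toward the chosen response: the loss drops below log(2).
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)
```

The loss falls as the policy assigns relatively more probability to the chosen response than the reference model does, which is how preference pairs such as those in Intel/orca_dpo_pairs steer the model without a separate reward model.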

Initial project commit; model provided by the ModelHub XC community. Model: traversaal-ai/traversaal-2.5-Mistral-7B. Source: Original Platform. 2026-05-25 02:28:17 +08:00