Initialize project; model provided by the ModelHub XC community
Model: chestnutlzj/MoE-Qwen-4x1.8B-pretrain-50000-ckpt Source: Original Platform
40
README.md
Normal file
@@ -0,0 +1,40 @@
---
license: apache-2.0
language:
- zh
---
# 4x1.8B MoE Qwen Ckpt 50000
This project builds a Mixture-of-Experts (MoE) model on top of the Qwen 1.8B model. We combined four instances of the original model and trained the result using special training methods.

This model is checkpoint 50000 from the continued pretraining stage.
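
The card does not say how the four models are combined at inference time. As a rough illustration of what an MoE layer typically does, the sketch below routes an input to the top-k of four experts and mixes their outputs with renormalised gate weights. Everything in it is an assumption made for illustration and is not taken from this project: the names `moe_forward` and `make_expert`, the toy scaling experts, the linear router, and `top_k=2`.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Expert = Callable[[Sequence[float]], Vector]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(
    x: Sequence[float],
    experts: Sequence[Expert],
    router_weights: Sequence[Sequence[float]],
    top_k: int = 2,  # assumption: the card does not state the routing scheme
) -> Tuple[Vector, List[int]]:
    """Generic top-k gating: mix the outputs of the k highest-scoring experts."""
    # One router logit per expert (dot product of a router row with the input).
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    # Indices of the top_k experts; ties keep the lower index first.
    chosen = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Renormalise the gate over the chosen experts only.
    gates = softmax([logits[i] for i in chosen])
    out = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        y = experts[idx](x)
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, chosen


def make_expert(scale: float) -> Expert:
    """Hypothetical toy expert that just scales its input."""
    return lambda x: [scale * xi for xi in x]


# Four toy experts standing in for the four Qwen 1.8B-derived experts.
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
router_weights = [[0.0, 0.0], [1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
out, chosen = moe_forward([1.0, 0.0], experts, router_weights)
print(chosen, out)
```

With these toy values, experts 1 and 2 receive equal gate weight, so the output is the average of their two outputs. A real MoE layer would apply this per token inside each transformer block, with learned router weights.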
# Evaluations

| Groups            | n-shot | Metric   | Value  |   | Stderr |
|-------------------|-------:|----------|-------:|---|-------:|
| boolq             |      0 | acc      | 0.6508 | ± | 0.0083 |
| ceval-valid       |      0 | acc      | 0.5290 | ± | 0.1912 |
|                   |      0 | acc_norm | 0.5290 | ± | 0.1912 |
| cmmlu             |      0 | acc      | 0.5087 | ± | 0.1237 |
|                   |      0 | acc_norm | 0.5087 | ± | 0.1237 |
| mathqa            |      0 | acc      | 0.2647 | ± | 0.0081 |
|                   |      0 | acc_norm | 0.2693 | ± | 0.0081 |
| mmlu              |      0 | acc      | 0.4353 | ± | 0.0830 |
| - stem            |      0 | acc      | 0.3809 | ± | 0.0659 |
| - social_sciences |      0 | acc      | 0.4959 | ± | 0.0708 |
| - other           |      0 | acc      | 0.4844 | ± | 0.0744 |
| - humanities      |      0 | acc      | 0.3998 | ± | 0.0849 |

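
To read the `Value ± Stderr` columns as a range, one option is a normal-approximation 95% interval, `value ± 1.96 * stderr`. The sketch below applies it to the two single-task rows (boolq and mathqa). The function name `confidence_interval` and the choice of `z = 1.96` are assumptions for illustration. The card does not state how the standard errors for the grouped rows (ceval-valid, cmmlu, mmlu) were aggregated, so those rows are left out.

```python
from typing import Dict, Tuple


def confidence_interval(value: float, stderr: float, z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation interval: value - z*stderr to value + z*stderr."""
    half_width = z * stderr
    return value - half_width, value + half_width


# (accuracy, standard error) copied from the evaluation table above.
results: Dict[str, Tuple[float, float]] = {
    "boolq": (0.6508, 0.0083),
    "mathqa": (0.2647, 0.0081),
}

intervals = {task: confidence_interval(acc, se) for task, (acc, se) in results.items()}
for task, (low, high) in intervals.items():
    print(f"{task}: 95% CI {low:.4f} to {high:.4f}")
```
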
# Acknowledgements

+ [Qwen](https://github.com/QwenLM/Qwen)
+ [mistral.ai](https://mistral.ai)

# License Agreement

This project is released under the Tongyi Qianwen Research License Agreement. The complete license agreement is available here: [Tongyi Qianwen RESEARCH LICENSE AGREEMENT](https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20RESEARCH%20LICENSE%20AGREEMENT).

When using this project, please ensure that your use complies with the terms and conditions of the license agreement.