---
license: apache-2.0
language:
- zh
---
# 4x1.8B MoE Qwen Ckpt 50000

This project builds a Mixture-of-Experts (MoE) model on top of the Qwen 1.8B model. We concatenated 4 original models and trained the result using special training methods.

This model is a checkpoint from the continued pretraining stage.

# Evaluations

|      Groups      |n-shot| Metric |Value |   |Stderr|
|------------------|-----:|--------|-----:|---|-----:|
|boolq             |     0|acc     |0.6508|±  |0.0083|
|ceval-valid       |     0|acc     |0.5290|±  |0.1912|
|                  |     0|acc_norm|0.5290|±  |0.1912|
|cmmlu             |     0|acc     |0.5087|±  |0.1237|
|                  |     0|acc_norm|0.5087|±  |0.1237|
|mathqa            |     0|acc     |0.2647|±  |0.0081|
|                  |     0|acc_norm|0.2693|±  |0.0081|
|mmlu              |     0|acc     |0.4353|±  |0.0830|
| - stem           |     0|acc     |0.3809|±  |0.0659|
| - social_sciences|     0|acc     |0.4959|±  |0.0708|
| - other          |     0|acc     |0.4844|±  |0.0744|
| - humanities     |     0|acc     |0.3998|±  |0.0849|
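
To read the `Value ± Stderr` columns, one option is to turn each pair into an approximate confidence interval. The sketch below is illustrative only: it assumes the reported `Stderr` can be treated as the standard error of the accuracy estimate and that a normal approximation is acceptable, neither of which is stated in this card. The helper name `confidence_interval` is hypothetical, and the two rows are copied from the single-task entries of the evaluation table.

```python
# Illustrative sketch: (accuracy, stderr) pairs copied from the evaluation table.
RESULTS = {
    "boolq": (0.6508, 0.0083),
    "mathqa": (0.2647, 0.0081),
}


def confidence_interval(value, stderr, z=1.96):
    """Return (low, high) for value ± z * stderr.

    Assumes a normal approximation; z=1.96 corresponds to roughly 95% coverage.
    """
    margin = z * stderr
    return value - margin, value + margin


if __name__ == "__main__":
    for name, (acc, se) in RESULTS.items():
        low, high = confidence_interval(acc, se)
        print(f"{name}: acc={acc:.4f}, approx. 95% CI=[{low:.4f}, {high:.4f}]")
```

The group-level rows (ceval-valid, cmmlu, mmlu) report much larger `Stderr` values, so intervals computed this way for them would be correspondingly wide and should be interpreted with care.
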
# Acknowledgements

+ [Qwen](https://github.com/QwenLM/Qwen)
+ [mistral.ai](https://mistral.ai)

# License Agreement

This project is open source under the Tongyi Qianwen Research License Agreement. The complete license text is available here: [Tongyi Qianwen Research License Agreement](https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20RESEARCH%20LICENSE%20AGREEMENT).

When using this project, please ensure that your use complies with the terms and conditions of the license agreement.