ModelHub XC aad76e555c Initialize project; model provided by the ModelHub XC community
Model: chestnutlzj/MoE-Qwen-4x1.8B-pretrain-50000-ckpt
Source: Original Platform
2026-04-19 21:35:02 +08:00

license: apache-2.0
language: zh

4x1.8B MoE Qwen Ckpt 50000

This is a Mixture-of-Experts (MoE) model built on the Qwen 1.8B model. We combined four original models into a single MoE model and trained it using special training methods.

This model is a checkpoint from the continued pretraining stage.
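To illustrate what combining four models into one MoE model can look like at inference time, here is a minimal sketch of generic top-k expert routing for a single token. It is not the training or routing method used in this project, which this README does not describe. The names `route`, `moe_layer`, and `make_expert`, the choice of `TOP_K = 2`, and the toy experts are all assumptions made for illustration.

```python
import math

NUM_EXPERTS = 4  # matches the "4x" in the model name
TOP_K = 2        # assumption: the README does not say how many experts are active


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, top_k=TOP_K):
    """Pick the top_k experts for one token and renormalize their weights."""
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_layer(x, experts, router_logits, top_k=TOP_K):
    """Weighted sum of the selected experts' outputs for one token vector x."""
    out = [0.0] * len(x)
    for idx, weight in route(router_logits, top_k):
        y = experts[idx](x)
        out = [o + weight * v for o, v in zip(out, y)]
    return out


def make_expert(scale):
    # Hypothetical stand-in for one expert's feed-forward block.
    return lambda x: [scale * v for v in x]


experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
assert len(experts) == NUM_EXPERTS

token = [1.0, -1.0]
logits = [0.1, 2.0, 0.3, 2.0]  # toy router scores, one per expert

print(route(logits))
print(moe_layer(token, experts, logits))
```

In this toy setup only two of the four experts run for the token, which is the usual motivation for MoE: total parameter count grows with the number of experts while per-token compute grows only with `top_k`.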

Evaluations

| Groups | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|
| boolq | 0 | acc | 0.6508 | ± 0.0083 |
| ceval-valid | 0 | acc | 0.5290 | ± 0.1912 |
| | 0 | acc_norm | 0.5290 | ± 0.1912 |
| cmmlu | 0 | acc | 0.5087 | ± 0.1237 |
| | 0 | acc_norm | 0.5087 | ± 0.1237 |
| mathqa | 0 | acc | 0.2647 | ± 0.0081 |
| | 0 | acc_norm | 0.2693 | ± 0.0081 |
| mmlu | 0 | acc | 0.4353 | ± 0.0830 |
| - stem | 0 | acc | 0.3809 | ± 0.0659 |
| - social_sciences | 0 | acc | 0.4959 | ± 0.0708 |
| - other | 0 | acc | 0.4844 | ± 0.0744 |
| - humanities | 0 | acc | 0.3998 | ± 0.0849 |
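As a reading aid for the Stderr column, the sketch below turns a reported accuracy and standard error into an approximate 95% confidence interval. It assumes a normal approximation (z = 1.96) and uses only the two single-task rows, boolq and mathqa, since grouped rows such as ceval-valid or mmlu may aggregate their standard errors differently. The helper name `confidence_interval` is hypothetical.

```python
Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


def confidence_interval(value, stderr, z=Z_95):
    """Return (low, high) for value +/- z * stderr."""
    half_width = z * stderr
    return (value - half_width, value + half_width)


# (accuracy, standard error) copied from the evaluation results above
results = {
    "boolq": (0.6508, 0.0083),
    "mathqa": (0.2647, 0.0081),
}

for task, (acc, se) in results.items():
    low, high = confidence_interval(acc, se)
    print(f"{task}: acc={acc:.4f}, approx. 95% CI [{low:.4f}, {high:.4f}]")
```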

Acknowledgements

License Agreement

This project is released as open source under the Tongyi Qianwen Research License Agreement. The complete license agreement is available at this link: [https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20RESEARCH%20LICENSE%20AGREEMENT].

When using this project, please ensure that your use complies with the terms and conditions of the license agreement.