Model: llm-jp/optimal-sparsity-math-d512-E64-k16-1.7B-A520M

---
pipeline_tag: text-generation
library_name: transformers
license: apache-2.0
tags:
- mixtral
- moe
- reasoning
---

# Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks

This repository contains model checkpoints from the paper *Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks*.

For more details, including code and evaluation procedures, please refer to the official GitHub repository: https://github.com/rioyokotalab/optimal-sparsity
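The checkpoint can be loaded like any causal language model on the Hub. The sketch below is a minimal, hypothetical usage example based on the card's `transformers` / `text-generation` metadata; the standard `AutoModelForCausalLM` API is assumed, and the prompt and generation settings are illustrative only — see the official repository above for the actual evaluation procedure.

```python
# Minimal sketch (assumption: standard transformers AutoModel API applies
# to this checkpoint; not taken from the official evaluation code).
MODEL_ID = "llm-jp/optimal-sparsity-math-d512-E64-k16-1.7B-A520M"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a completion for `prompt` using greedy decoding."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("What is 12 * 7?"))
```

Downloading the weights requires network access; for reasoning benchmarks you would normally swap the prompt and decoding settings for those used in the paper's evaluation scripts.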

## How to cite

If you find our work helpful, please cite the paper:

@inproceedings{nakamura2026optimal,
  title={Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks},
  author={Taishi Nakamura and Satoki Ishikawa and Masaki Kawamura and Takumi Okamoto and Daisuke Nohara and Jun Suzuki and Rio Yokota},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=XFw2EPRUUR}
}