---
pipeline_tag: text-generation
library_name: transformers
license: apache-2.0
tags:
- mixtral
- moe
- reasoning
---
# Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks
This repository contains model checkpoints from the paper [Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks](https://huggingface.co/papers/2508.18672).
For more details, including code and evaluation procedures, please refer to the official GitHub repository: [https://github.com/rioyokotalab/optimal-sparsity](https://github.com/rioyokotalab/optimal-sparsity)
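
## How to use

A minimal usage sketch, assuming the checkpoints load as standard `transformers` causal language models (per the `library_name: transformers` and `pipeline_tag: text-generation` metadata above). The repo id below is a placeholder; substitute the actual checkpoint name from this repository.

```python
# Minimal sketch: load a checkpoint and run greedy text generation.
# The model id is a placeholder, not a confirmed checkpoint name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rioyokotalab/optimal-sparsity-checkpoint"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The capital of Japan is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```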
## How to cite
If you find our work helpful, please feel free to cite the paper:
```bibtex
@inproceedings{
nakamura2026optimal,
title={Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks},
author={Taishi Nakamura and Satoki Ishikawa and Masaki Kawamura and Takumi Okamoto and Daisuke Nohara and Jun Suzuki and Rio Yokota},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=XFw2EPRUUR}
}
```