---
pipeline_tag: text-generation
library_name: transformers
license: apache-2.0
tags:
- mixtral
- moe
- reasoning
---
# Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks
This repository contains the `llm-jp/optimal-sparsity-code-d512-E8-k2-320M-A170M` checkpoint from the paper [Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks](https://huggingface.co/papers/2508.18672).

For more details, including code and evaluation procedures, see the official GitHub repository: [https://github.com/rioyokotalab/optimal-sparsity](https://github.com/rioyokotalab/optimal-sparsity).
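
## Usage

A minimal loading sketch, assuming the checkpoint exposes the standard `transformers` causal-LM interface (the `mixtral` tag suggests a Mixtral-style MoE). The prompt and generation settings below are illustrative, not taken from the paper:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "llm-jp/optimal-sparsity-code-d512-E8-k2-320M-A170M"

# Load the tokenizer and model weights from the Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Illustrative prompt; the "code" in the model name suggests
# code-oriented pretraining (an assumption, not stated in this card).
inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
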
## How to cite
If you find our work helpful, please cite the paper:
```bibtex
@inproceedings{nakamura2026optimal,
  title={Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks},
  author={Taishi Nakamura and Satoki Ishikawa and Masaki Kawamura and Takumi Okamoto and Daisuke Nohara and Jun Suzuki and Rio Yokota},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=XFw2EPRUUR}
}
```