Model: llm-jp/optimal-sparsity-code-d1024-E16-k4-1.9B-A670M
| pipeline_tag | library_name | license | tags |
|---|---|---|---|
| text-generation | transformers | apache-2.0 | |
Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks
This repository contains model checkpoints from the paper Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks.
For more details, including code and evaluation procedures, please refer to the official GitHub repository: https://github.com/rioyokotalab/optimal-sparsity
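The checkpoint name encodes the MoE configuration varied in the paper. A minimal sketch of decoding it, assuming the common naming convention (d = hidden size, E = total experts, k = active experts per token, total vs. active parameter counts); this interpretation is an assumption, not documented in this card:

```python
import re

# Hypothetical decoding of the checkpoint name; the meaning of each
# field (d/E/k, total "B" vs. active "M" parameters) is assumed from
# common MoE naming conventions.
name = "optimal-sparsity-code-d1024-E16-k4-1.9B-A670M"
m = re.search(r"d(\d+)-E(\d+)-k(\d+)-([\d.]+)B-A(\d+)M", name)
hidden, experts, topk, total_b, active_m = m.groups()

# Fraction of experts active per token (the "sparsity" knob the paper studies).
active_fraction = int(topk) / int(experts)
print(hidden, experts, topk, active_fraction)  # 1024 16 4 0.25
```

So this checkpoint would route each token to 4 of 16 experts, activating roughly 670M of its 1.9B parameters.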
How to cite
If you find our work helpful, please cite the paper:
```bibtex
@inproceedings{
  nakamura2026optimal,
  title={Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks},
  author={Taishi Nakamura and Satoki Ishikawa and Masaki Kawamura and Takumi Okamoto and Daisuke Nohara and Jun Suzuki and Rio Yokota},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=XFw2EPRUUR}
}
```
Description