ModelHub XC 61b848acf9 初始化项目,由ModelHub XC社区提供模型
Model: SamsungSAILMontreal/Qwen3-30B-A3B-Instruct-2507-REAP
Source: Original Platform
2026-05-09 12:20:39 +08:00

library_name, tags, license, base_model
library_name tags license base_model
transformers
compression
expert-merging
moe
apache-2.0
Qwen/Qwen3-30B-A3B-Instruct-2507

Qwen3-30B-A3B-Instruct-2507-REAP

This model is a compressed version of Qwen/Qwen3-30B-A3B-Instruct-2507. It is obtained by reducing the number of experts in each MoE layer from 128 to 96 using the REAP baseline method as described in https://bknyaz.github.io/blog/2026/moe/. The compressed model has 23B params (44GB) instead of 31B (57GB) of the original model, reducing storage and GPU memory requirements by roughly 25%. At the same time, the model retains >=90% of the original model's performance on a variety of benchmarks (see Results section below). Additional efficiency optimization (e.g., quantization) can be added similarly to the original model.

See additional details at Qwen3-30B-A3B-Instruct-2507-REAM.

Results

Model Winogrande ARC-C ARC-E BoolQ HellaSwag MMLU OpenBookQA RTE AVG
Qwen3-30B-A3B-Instruct-2507 73.2 60.7 85.1 88.7 61.2 80.1 32.4 76.5 69.7
Qwen3-30B-A3B-Instruct-2507-REAP 71.7 49.3 77.4 88.1 56.5 69.3 29.6 78.3 65.0
Model IFeval AIME25 GSM8K GPQA-D HumanEval LiveCodeBench AVG
Qwen3-30B-A3B-Instruct-2507 90.4 56.7 89.3 47.0 93.3 48.6 70.9
Qwen3-30B-A3B-Instruct-2507-REAP 88.0 56.7 87.9 37.9 81.7 33.0 64.2

License

Please refer to the license of the original model Qwen/Qwen3-30B-A3B-Instruct-2507.

Description
Model synced from source: SamsungSAILMontreal/Qwen3-30B-A3B-Instruct-2507-REAP
Readme 2 MiB
Languages
Jinja 100%