---
library_name: transformers
base_model: Qwen/Qwen3-8B-Base
tags:
- multilingual
- reasoning
- LLM
- qwen3
license: apache-2.0
datasets:
- lightonai/Dolci-Think-SFT-32B-Multilingual
language:
- zh
pipeline_tag: text-generation
---

# Qwen3-8B-ZH

`Qwen3-8B-ZH` is a **native reasoning model** fine-tuned from [`Qwen/Qwen3-8B-Base`](https://huggingface.co/Qwen/Qwen3-8B-Base) to reason in Chinese. It produces its **entire reasoning trace in Chinese** before giving the final answer, also in Chinese.

It is released alongside the paper [**Rethinking the Multilingual Reasoning Gap with Layer Swap**](https://arxiv.org/abs/2605.26735).

## Model details

- **Base model:** `Qwen/Qwen3-8B-Base`
- **Language:** Chinese (CoT and answer)
- **Training:** Full SFT, ~10B tokens, 2 epochs
- **Context length:** 32,768 tokens
- **Dataset:** [`lightonai/Dolci-Think-SFT-32B-Multilingual`](https://huggingface.co/datasets/lightonai/Dolci-Think-SFT-32B-Multilingual) (Chinese split).

> [!NOTE]
> The model was trained on data derived from `allenai/Dolci-Think-SFT-32B`, released under the ODC-BY-1.0 license.

## Related models

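This model is one of three Chinese specialists designed to study the native reasoning gap. The table also lists the English specialist, `Qwen3-8B-EN`, which supplies the transplanted layers for the Layer Swap variant: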

| Model | CoT language | Description |
|---|---|---|
| [`lightonai/Qwen3-8B-ZH`](https://huggingface.co/lightonai/Qwen3-8B-ZH) | Chinese | Native reasoning specialist |
| [`lightonai/Qwen3-8B-ZH-Swap`](https://huggingface.co/lightonai/Qwen3-8B-ZH-Swap) | Chinese | Layer Swap: middle layers (L13–L20) of `Qwen3-8B-EN` transplanted into `Qwen3-8B-ZH` |
| [`lightonai/Qwen3-8B-ZH-Pivot-EN`](https://huggingface.co/lightonai/Qwen3-8B-ZH-Pivot-EN) | English | Same Chinese Q&A pairs, but CoT in English |
| [`lightonai/Qwen3-8B-EN`](https://huggingface.co/lightonai/Qwen3-8B-EN) | English | English specialist |
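
The Layer Swap row can be illustrated with a minimal sketch that operates on state dicts. It is not the authors' implementation: the helper `layer_swap`, the `model.layers.<index>.` key pattern, and the reading of L13–L20 as an inclusive range of layer indices are all assumptions made for illustration, and the toy dictionaries stand in for real checkpoints.

```python
import re

# Assumption: decoder-layer parameters are keyed as "model.layers.<index>.<rest>".
LAYER_KEY = re.compile(r"^model\.layers\.(\d+)\.")


def layer_swap(recipient_state, donor_state, first_layer=13, last_layer=20):
    """Return a copy of recipient_state whose layers first_layer..last_layer
    (inclusive, an assumed convention) are taken from donor_state."""
    swapped = dict(recipient_state)
    for name, value in donor_state.items():
        match = LAYER_KEY.match(name)
        if match and first_layer <= int(match.group(1)) <= last_layer:
            if name not in swapped:
                raise KeyError(f"{name} is missing from the recipient state dict")
            swapped[name] = value
    return swapped


# Toy state dicts: strings stand in for tensors, and 24 layers is arbitrary.
recipient = {f"model.layers.{i}.mlp.weight": f"zh-{i}" for i in range(24)}
recipient["model.embed_tokens.weight"] = "zh-embed"
donor = {f"model.layers.{i}.mlp.weight": f"en-{i}" for i in range(24)}
donor["model.embed_tokens.weight"] = "en-embed"

merged = layer_swap(recipient, donor)
print(merged["model.layers.12.mlp.weight"], merged["model.layers.13.mlp.weight"])
```

With real checkpoints, the same function could in principle be applied to the two models' `state_dict()` outputs before loading the result into the recipient, provided both share the same architecture.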

## Evaluation

All scores are mean accuracy (%) on the **Chinese** version of each benchmark. AIME 24/25 is averaged over 30 runs and the other benchmarks over 10 runs, all using the recommended generation parameters. The best score in each column is underlined.

| Model | MGSM-Rev2 | Global-MMLU-Lite | GPQA-Diamond | AIME 24/25 | HumanEvalPlus | Average |
|---|:---:|:---:|:---:|:---:|:---:|:---:|
| `Qwen3-8B-ZH` | 88.92 | 74.85 | 50.71 | 53.89 | 85.62 | 70.80 |
| `Qwen3-8B-ZH-Swap` | 88.24 | <u>76.42</u> | 52.58 | 55.17 | <u>85.69</u> | 71.62 |
| `Qwen3-8B-ZH-Pivot-EN` | <u>94.84</u> | 76.15 | <u>54.19</u> | <u>59.06</u> | 85.19 | <u>73.89</u> |
| `Qwen3-8B-EN` | 76.04 | 75.00 | 47.53 | 50.00 | 83.88 | 66.49 |
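
As a quick consistency check, the Average column can be recomputed from the five benchmark columns. The sketch below assumes the average is an unweighted mean over MGSM-Rev2, Global-MMLU-Lite, GPQA-Diamond, AIME 24/25, and HumanEvalPlus; the numbers are copied from the table above.

```python
from statistics import mean

# Per-benchmark scores in table order:
# MGSM-Rev2, Global-MMLU-Lite, GPQA-Diamond, AIME 24/25, HumanEvalPlus.
scores = {
    "Qwen3-8B-ZH": [88.92, 74.85, 50.71, 53.89, 85.62],
    "Qwen3-8B-ZH-Swap": [88.24, 76.42, 52.58, 55.17, 85.69],
    "Qwen3-8B-ZH-Pivot-EN": [94.84, 76.15, 54.19, 59.06, 85.19],
    "Qwen3-8B-EN": [76.04, 75.00, 47.53, 50.00, 83.88],
}

# Unweighted mean across benchmarks (assumed), rounded to two decimals.
averages = {model: round(mean(values), 2) for model, values in scores.items()}

for model, average in averages.items():
    print(f"{model}: {average:.2f}")
```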

**Benchmarks used:**

- [`lightonai/gpqa_diamond_multilingual`](https://huggingface.co/datasets/lightonai/gpqa_diamond_multilingual)
- [`lightonai/aime24_multilingual`](https://huggingface.co/datasets/lightonai/aime24_multilingual)
- [`lightonai/aime25_multilingual`](https://huggingface.co/datasets/lightonai/aime25_multilingual)
- [`lightonai/HumanEvalPlus_multilingual`](https://huggingface.co/datasets/lightonai/HumanEvalPlus_multilingual)
- [`lightonai/mgsm-rev2`](https://huggingface.co/datasets/lightonai/mgsm-rev2)
- [`CohereLabs/Global-MMLU-Lite`](https://huggingface.co/datasets/CohereLabs/Global-MMLU-Lite)

## Usage

```python
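from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "lightonai/Qwen3-8B-ZH"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

# Chinese prompt, in English: "Compute: 24 × 17 = ?"
messages = [{"role": "user", "content": "计算：24 × 17 = ?"}]
# return_dict=True yields both input_ids and attention_mask.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)

# do_sample=True is needed for the sampling parameters to take effect.
outputs = model.generate(
    **inputs, max_new_tokens=32768, do_sample=True,
    temperature=1.0, top_p=0.95, top_k=20, min_p=0.0,
)
# Decode only the newly generated tokens (reasoning trace and final answer).
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))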
```

Recommended sampling: `temperature=1.0`, `top_p=0.95`, `top_k=20`, `min_p=0`.
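
Because the model writes its reasoning trace before the final answer, it can be convenient to separate the two after decoding. The helper below is a hypothetical sketch: it assumes the trace is wrapped in Qwen3-style `<think>` and `</think>` tags, which this card does not confirm, so check the delimiters against actual model output before relying on it.

```python
def split_reasoning(text, start_tag="<think>", end_tag="</think>"):
    """Split decoded output into (reasoning, answer).

    The tag names are an assumption; if end_tag is absent, the whole
    text is returned as the answer with an empty reasoning string.
    """
    head, separator, tail = text.partition(end_tag)
    if not separator:
        return "", text.strip()
    reasoning = head.replace(start_tag, "", 1).strip()
    return reasoning, tail.strip()


# Hand-written example output, not produced by the model.
decoded = "<think>24 × 17 = 24 × 10 + 24 × 7 = 240 + 168 = 408</think>\n\n答案是 408。"
reasoning, answer = split_reasoning(decoded)
print("reasoning:", reasoning)
print("answer:", answer)
```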

## Citation

If you find this work helpful, please consider citing it:

```bibtex
@misc{lasbordes2026rethinking,
  title        = {Rethinking the Multilingual Reasoning Gap with Layer Swap},
  author       = {Lasbordes, Maxence and Chatelain, Amélie and Seddah, Djamé},
  year         = {2026},
  eprint       = {2605.26735},
  archivePrefix= {arXiv},
  primaryClass = {cs.CL}
}
```