ModelHub XC aeb152e538 Initialize project; model provided by the ModelHub XC community
Model: lightonai/Qwen3-8B-FR
Source: Original Platform
2026-06-03 20:57:07 +08:00

---
library_name: transformers
base_model: Qwen/Qwen3-8B-Base
tags:
- multilingual
- reasoning
- LLM
- qwen3
license: apache-2.0
datasets:
- lightonai/Dolci-Think-SFT-32B-Multilingual
language:
- fr
pipeline_tag: text-generation
---
# Qwen3-8B-FR
`Qwen3-8B-FR` is a **native reasoning model** fine-tuned from [`Qwen/Qwen3-8B-Base`](https://huggingface.co/Qwen/Qwen3-8B-Base) to reason in French. It produces its **entire reasoning trace in French** before delivering the final answer, also in French.
It is released alongside the paper [**Rethinking the Multilingual Reasoning Gap with Layer Swap**](https://arxiv.org/abs/2605.26735).
## Model details
- **Base model:** `Qwen/Qwen3-8B-Base`
- **Language:** French (CoT and answer)
- **Training:** Full SFT, ~10B tokens, 2 epochs
- **Context length:** 32,768 tokens
- **Dataset:** [`lightonai/Dolci-Think-SFT-32B-Multilingual`](https://huggingface.co/datasets/lightonai/Dolci-Think-SFT-32B-Multilingual) (French split)
> [!NOTE]
> The model was trained on data derived from `allenai/Dolci-Think-SFT-32B`, released under the ODC-BY-1.0 license.
## Related models
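This model is part of a French specialist trio designed to study the native reasoning gap. The table lists the three French specialists together with the English specialist that serves as the Layer Swap donor: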
| Model | CoT language | Description |
|---|---|---|
| [`lightonai/Qwen3-8B-FR`](https://huggingface.co/lightonai/Qwen3-8B-FR) | French | Native reasoning specialist |
| [`lightonai/Qwen3-8B-FR-Swap`](https://huggingface.co/lightonai/Qwen3-8B-FR-Swap) | French | Layer Swap: middle layers (L13 to L22) of `Qwen3-8B-EN` transplanted into `Qwen3-8B-FR` |
| [`lightonai/Qwen3-8B-FR-Pivot-EN`](https://huggingface.co/lightonai/Qwen3-8B-FR-Pivot-EN) | English | Same French Q&A pairs, but CoT in English |
| [`lightonai/Qwen3-8B-EN`](https://huggingface.co/lightonai/Qwen3-8B-EN) | English | English specialist |
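
The Layer Swap row can be pictured as a copy between two checkpoints. The following is a minimal sketch, not the authors' implementation: it assumes the layer range in the table means indices 13 through 22 inclusive, that parameters are keyed as `model.layers.<index>.<rest>` (the naming commonly used by Hugging Face Qwen-style checkpoints), and it uses toy dictionaries with strings in place of weight tensors. The `layer_swap` helper and the toy layer count are hypothetical.

```python
import re
from typing import Dict, Iterable

# Assumption: decoder-layer parameters are keyed as "model.layers.<index>.<rest>".
LAYER_KEY = re.compile(r"^model\.layers\.(\d+)\.")


def layer_swap(recipient: Dict[str, object], donor: Dict[str, object],
               layers: Iterable[int]) -> Dict[str, object]:
    """Return a copy of `recipient` whose selected layers come from `donor`."""
    selected = set(layers)
    merged = dict(recipient)  # shallow copy; the inputs are left untouched
    for key, value in donor.items():
        match = LAYER_KEY.match(key)
        if match and int(match.group(1)) in selected:
            if key not in merged:
                raise KeyError(f"recipient has no parameter named {key!r}")
            merged[key] = value
    return merged


# Toy state dicts: strings stand in for tensors, and the layer count is illustrative.
fr = {f"model.layers.{i}.mlp.weight": f"fr-{i}" for i in range(36)}
en = {f"model.layers.{i}.mlp.weight": f"en-{i}" for i in range(36)}
fr["model.embed_tokens.weight"] = "fr-embed"
en["model.embed_tokens.weight"] = "en-embed"

# Assumption: the swapped range is inclusive, so layers 13, 14, ..., 22.
swapped = layer_swap(fr, en, range(13, 23))
print(swapped["model.layers.12.mlp.weight"], swapped["model.layers.13.mlp.weight"])
```

With real checkpoints the same idea would apply to the two models' `state_dict()` outputs, with embeddings, the output head, and all layers outside the range kept from the French model.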
## Evaluation
All scores are mean accuracy (%) on the **French** version of each benchmark, with sample standard deviation across runs. AIME 24/25 is averaged over 30 runs and the other benchmarks over 10 runs, all using the recommended generation parameters. The best score in each column is underlined.
| Model | MGSM-Rev2 | Global-MMLU-Lite | GPQA-Diamond | AIME 24/25 | HumanEvalPlus | Average |
|---|:---:|:---:|:---:|:---:|:---:|:---:|
| `Qwen3-8B-FR` | 92.80 | 76.45 | 53.59 | 55.67 | 83.31 | 72.36 |
| `Qwen3-8B-FR-Swap` | <u>97.40</u> | 76.57 | 54.55 | 59.11 | <u>86.06</u> | 74.74 |
| `Qwen3-8B-FR-Pivot-EN` | 94.52 | <u>78.37</u> | <u>54.65</u> | <u>62.78</u> | 84.88 | <u>75.04</u> |
| `Qwen3-8B-EN` | 95.72 | 77.50 | 52.53 | 61.39 | 84.19 | 74.27 |
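
As a sanity check on the table, the Average column appears to be the unweighted mean of the five benchmark scores. The sketch below reproduces it for `Qwen3-8B-FR` and then shows how a mean and sample standard deviation across runs could be computed. The per-run accuracies in `runs` are hypothetical placeholders, not reported results.

```python
from statistics import mean, stdev

# French-benchmark scores for Qwen3-8B-FR, copied from the table above.
scores = {
    "MGSM-Rev2": 92.80,
    "Global-MMLU-Lite": 76.45,
    "GPQA-Diamond": 53.59,
    "AIME 24/25": 55.67,
    "HumanEvalPlus": 83.31,
}
average = round(mean(scores.values()), 2)
print(average)  # matches the Average column for this row

# Hypothetical per-run accuracies (NOT real results), used only to show
# the mean and the sample standard deviation (n - 1 denominator).
runs = [52.0, 54.5, 53.0, 55.5]
print(round(mean(runs), 2), round(stdev(runs), 2))
```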
**Benchmarks used:**
- [`lightonai/gpqa_diamond_multilingual`](https://huggingface.co/datasets/lightonai/gpqa_diamond_multilingual)
- [`lightonai/aime24_multilingual`](https://huggingface.co/datasets/lightonai/aime24_multilingual)
- [`lightonai/aime25_multilingual`](https://huggingface.co/datasets/lightonai/aime25_multilingual)
- [`lightonai/HumanEvalPlus_multilingual`](https://huggingface.co/datasets/lightonai/HumanEvalPlus_multilingual)
- [`lightonai/mgsm-rev2`](https://huggingface.co/datasets/lightonai/mgsm-rev2)
- [`CohereLabs/Global-MMLU-Lite`](https://huggingface.co/datasets/CohereLabs/Global-MMLU-Lite)
## Usage
```python
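from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "lightonai/Qwen3-8B-FR"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Résous : 24 × 17 = ?"}]
# Build the prompt with the chat template; return_dict=True also returns the attention mask.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)
# do_sample=True is needed for the sampling parameters below to take effect.
outputs = model.generate(**inputs, max_new_tokens=32768, do_sample=True, temperature=1.0, top_p=0.95, top_k=20, min_p=0.0)
# Decode only the newly generated tokens (the French reasoning trace followed by the answer).
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))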
```
Recommended sampling: `temperature=1.0`, `top_p=0.95`, `top_k=20`, `min_p=0`.
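
To make the recommended parameters concrete, here is a rough, library-independent sketch of how temperature, top-k, top-p, and min-p narrow a next-token distribution. It is an illustration under assumptions, not the code path of any particular inference library: the filters are applied in the order temperature, top-k, top-p, min-p with renormalisation after each step, and real implementations may differ in ordering and tie-breaking. The `filter_distribution` helper and the toy logits are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def _normalise(items: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Rescale weights so they sum to 1."""
    total = sum(weight for _, weight in items)
    return [(token, weight / total) for token, weight in items]


def filter_distribution(logits: Dict[str, float], temperature: float = 1.0,
                        top_k: int = 20, top_p: float = 0.95,
                        min_p: float = 0.0) -> Dict[str, float]:
    """Apply temperature, top-k, top-p and min-p in turn (assumed order)."""
    # Temperature-scaled softmax, with the max subtracted for stability.
    peak = max(logits.values())
    ranked = _normalise(sorted(
        ((token, math.exp((logit - peak) / temperature))
         for token, logit in logits.items()),
        key=lambda item: item[1], reverse=True))

    # top-k: keep only the k most likely tokens.
    ranked = _normalise(ranked[:top_k])

    # top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    kept = _normalise(kept)

    # min-p: drop tokens below min_p times the top probability (0 disables it).
    floor = min_p * kept[0][1]
    kept = [(token, prob) for token, prob in kept if prob >= floor]
    return dict(_normalise(kept))


# Toy logits for four candidate tokens (illustrative values only).
toy_logits = {"a": 5.0, "b": 2.0, "c": 1.0, "d": 0.0}
filtered = filter_distribution(toy_logits)
print(filtered)
```

With `min_p=0` the last filter removes nothing, so on this toy input only top-p trims the tail and the two most likely tokens remain.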
## Citation
If you find our work helpful, please consider citing it:
```bibtex
@misc{lasbordes2026rethinking,
  title         = {Rethinking the Multilingual Reasoning Gap with Layer Swap},
  author        = {Lasbordes, Maxence and Chatelain, Amélie and Seddah, Djamé},
  year          = {2026},
  eprint        = {2605.26735},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CL}
}
```