96 lines
3.6 KiB
Markdown
96 lines
3.6 KiB
Markdown
|
|
---
|
|||
|
|
license: apache-2.0
|
|||
|
|
base_model: openeurollm/OLMo-3-7B-Instruct-SFT
|
|||
|
|
language:
|
|||
|
|
- en
|
|||
|
|
- cs
|
|||
|
|
- de
|
|||
|
|
- es
|
|||
|
|
- fi
|
|||
|
|
- fr
|
|||
|
|
- it
|
|||
|
|
- sv
|
|||
|
|
library_name: transformers
|
|||
|
|
tags:
|
|||
|
|
- olmo
|
|||
|
|
- sft
|
|||
|
|
- multilingual
|
|||
|
|
- european-languages
|
|||
|
|
- dolci-translated
|
|||
|
|
- continued-sft
|
|||
|
|
datasets:
|
|||
|
|
- allenai/Dolci-Instruct-SFT
|
|||
|
|
- openeurollm/Dolci-Instruct-SFT-translated
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
# OLMo-3-7B Dolci-Translated A-75EN
|
|||
|
|
|
|||
|
|
Continued-SFT of `openeurollm/OLMo-3-7B-Instruct-SFT` on a $75/25$ English:EU
|
|||
|
|
mixture, the headline configuration of the paper *Translate, Replay, Mix:
|
|||
|
|
Exploring Multilingual Post-Training for Low-Resource European Languages*.
|
|||
|
|
|
|||
|
|
- Qualitative completions viewer: https://ferreirafabio.github.io/olmo3-multilingual-dolci-sft-progression/
|
|||
|
|
|
|||
|
|
## Recipe
|
|||
|
|
|
|||
|
|
| | |
|
|||
|
|
|---|---|
|
|||
|
|
| **Base checkpoint** | `openeurollm/OLMo-3-7B-Instruct-SFT` (our reproduction at parity with `allenai/OLMo-3-7B-Instruct-SFT`) |
|
|||
|
|
| **English half (Dolci replay)** | `allenai/Dolci-Instruct-SFT`, 75% of the mixture |
|
|||
|
|
| **EU half (Dolci-Translated)** | `openeurollm/Dolci-Instruct-SFT-translated`, 25% of the mixture, 7 EU languages translated with `gemma-3-27b-it` |
|
|||
|
|
| **EU languages** | cs, de, es, fi, fr, it, sv |
|
|||
|
|
| **Total samples** | 2.87M (2,152,112 en + 717,370 EU) |
|
|||
|
|
| **Final step** | 3998 |
|
|||
|
|
| **Chat template** | `olmo` (inherited from base) |
|
|||
|
|
|
|||
|
|
## Training configuration
|
|||
|
|
|
|||
|
|
- Optimiser: AdamW, $\beta_1=0.9$, $\beta_2=0.95$
|
|||
|
|
- Peak learning rate: $8\times10^{-5}$, linear warm-up + cosine decay
|
|||
|
|
- Effective batch size: ${\sim}1$M tokens per step
|
|||
|
|
- Sequence length: 32,768
|
|||
|
|
- Precision: BF16
|
|||
|
|
- DeepSpeed ZeRO stage 2
|
|||
|
|
- Hardware: 8 × NVIDIA H200 SXM (HoreKa), $2\times4$ topology
|
|||
|
|
- Training framework: OLMo-core via the `open-instruct` fork
|
|||
|
|
|
|||
|
|
## Evaluation
|
|||
|
|
|
|||
|
|
Bradley-Terry Elo (Qwen3.5-27B judge, LMArena BT implementation, 500
|
|||
|
|
battles/language, 100 bootstrap resamples):
|
|||
|
|
|
|||
|
|
| Metric | A-75EN (this checkpoint) | Baseline (`openeurollm/OLMo-3-7B-Instruct-SFT`) |
|
|||
|
|
|---------------------|--------------------------|-------------------------------------------------|
|
|||
|
|
| Overall Elo | $789 \pm 7$ | $762 \pm 7$ |
|
|||
|
|
| English Elo | $\mathbf{950 \pm 14}$ | $954 \pm 16$ |
|
|||
|
|
| Non-English Elo | $\mathbf{755 \pm 8}$ | $697 \pm 9$ |
|
|||
|
|
|
|||
|
|
Per-language Elo (cs / de / es / fi / fr / it / sv):
|
|||
|
|
|
|||
|
|
| en | cs | de | es | fi | fr | it | sv |
|
|||
|
|
|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
|
|||
|
|
| $950 \pm 14$ | $714 \pm 19$ | $690 \pm 24$ | $746 \pm 18$ | $732 \pm 44$ | $743 \pm 17$ | $\mathbf{820 \pm 15}$ | $722 \pm 35$ |
|
|||
|
|
|
|||
|
|
A-75EN preserves English Elo within CI of baseline and improves on every EU
|
|||
|
|
language except Swedish, with the largest gain on Italian. Full per-language
|
|||
|
|
breakdown and the comparison to A-25EN are in the paper, Tables 2 and 3.
|
|||
|
|
|
|||
|
|
## Intermediate checkpoints
|
|||
|
|
|
|||
|
|
Training-step revisions (`step500`, `step1500`, `step2500`, `step3500`) are
|
|||
|
|
available as HF git revisions of this repo (loaded via `revision="step1500"`)
|
|||
|
|
and back the qualitative completions viewer at https://ferreirafabio.github.io/olmo3-multilingual-dolci-sft-progression/.
|
|||
|
|
|
|||
|
|
## How to load
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
from transformers import AutoTokenizer, AutoModelForCausalLM
|
|||
|
|
tok = AutoTokenizer.from_pretrained("openeurollm/OLMo-3-7B-Dolci-Translated-A-75EN")
|
|||
|
|
model = AutoModelForCausalLM.from_pretrained("openeurollm/OLMo-3-7B-Dolci-Translated-A-75EN", torch_dtype="bfloat16")
|
|||
|
|
# tok.chat_template is set; use tok.apply_chat_template(...) directly
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
## Citation
|
|||
|
|
|
|||
|
|
Please cite the paper and the OLMo-3 family if you use this checkpoint.
|