---
base_model: dnotitia/Smoothie-Qwen3-14B
library_name: transformers
pipeline_tag: text-generation
license: apache-2.0
---

<p align="center">
  <img src="assets/dna-2.1-logo.png" width="400" style="margin: 40px auto;">
</p>

# DNA 2.1

**DNA 2.1** is a fine-tuned Qwen3 14B model trained in two stages to reason natively in Korean. It is released alongside the paper [Making Qwen3 Think in Korean with Reinforcement Learning](https://arxiv.org/abs/2508.10355).

## Key Features

- **Two-Stage Training Approach**: Supervised fine-tuning (SFT) on high-quality Korean reasoning datasets, followed by reinforcement learning with our proposed **Oracle-Guided Dr. GRPO** algorithm
- **Native Korean Thinking**: Conducts its internal chain-of-thought reasoning entirely in Korean
- **Stable RL Training**: Addresses reward hacking and policy collapse by using an oracle judge model to calibrate the reward signal
- **Enhanced Reasoning Performance**: Substantially improved results on advanced reasoning benchmarks, particularly in math and coding tasks
- **Preserved Knowledge & Language Proficiency**: Maintains existing knowledge and language capabilities after reinforcement learning
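
Because the Korean reasoning trace and the final answer come back in a single generated string, downstream code usually needs to separate them. The sketch below assumes DNA 2.1 keeps the `<think>…</think>` delimiters that Qwen3 uses for its reasoning block; this card does not state that, so check the model's chat template before relying on it. `split_reasoning` is a hypothetical helper written for this example, not part of any library.

```python
import re

# Assumption: reasoning is wrapped in Qwen3-style <think>...</think> tags.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(generated: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a generated string.

    If no <think> block is present, reasoning is empty and the whole
    string is treated as the answer.
    """
    match = _THINK_RE.search(generated)
    if match is None:
        return "", generated.strip()
    reasoning = match.group(1).strip()
    answer = generated[match.end():].strip()
    return reasoning, answer


# Made-up sample output, for illustration only.
sample = "<think>\n2와 3을 더하면 5이다.\n</think>\n\n정답은 5입니다."
reasoning, answer = split_reasoning(sample)
print(reasoning)
print(answer)
```

Keeping the two parts separate makes it straightforward to log or hide the reasoning while showing only the answer to end users.
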

## Base Model

This model builds upon [Smoothie Qwen3](https://huggingface.co/collections/dnotitia/smoothie-qwen3-6811896ebb3a255de7b5b437), which reduces Chinese token emission probabilities and enhances Korean reasoning capabilities.
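
To make "reduces Chinese token emission probabilities" concrete, here is a toy sketch of the general idea: lowering the logits of a chosen set of tokens shifts probability mass to the remaining ones. It is an illustration only and does not reproduce Smoothie Qwen3's actual procedure; the vocabulary, logits, `penalty` value, and the `downweight` helper are all made up for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def downweight(logits, token_ids, penalty):
    """Subtract `penalty` from the logits at `token_ids` (hypothetical helper)."""
    suppressed = set(token_ids)
    return [x - penalty if i in suppressed else x for i, x in enumerate(logits)]


# Toy vocabulary: index 1 stands in for a Chinese token (made-up values).
vocab = ["안녕", "你好", "hello"]
logits = [2.0, 2.0, 1.0]

before = softmax(logits)
after = softmax(downweight(logits, token_ids=[1], penalty=3.0))

# Print each token's probability before and after the adjustment.
for token, p0, p1 in zip(vocab, before, after):
    print(f"{token}: {p0:.3f} -> {p1:.3f}")
```

The suppressed token's probability drops while the others rise, and the distribution still sums to one.
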

## Citation

If you use this model in your research, please cite our paper:

```bibtex
@misc{lee2025makingqwen3thinkkorean,
  title={Making Qwen3 Think in Korean with Reinforcement Learning},
  author={Jungyup Lee and Jemin Kim and Sang Park and SeungJae Lee},
  year={2025},
  eprint={2508.10355},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2508.10355},
}
```