ModelHub XC a61bf1dd63 project initialized; model provided by the ModelHub XC community
Model: dnotitia/DNA-2.1-14B
Source: Original Platform
2026-05-07 08:26:47 +08:00

base_model: dnotitia/Smoothie-Qwen3-14B
library_name: transformers
pipeline_tag: text-generation
license: apache-2.0

DNA 2.1

DNA 2.1 is a fine-tuned Qwen3 14B model that thinks natively in Korean, trained with a two-stage approach. It is released alongside the paper "Making Qwen3 Think in Korean with Reinforcement Learning".

Key Features

  • Two-Stage Training Approach: Supervised fine-tuning (SFT) on high-quality Korean reasoning datasets followed by reinforcement learning with our proposed Oracle-Guided Dr. GRPO algorithm
  • Native Korean Thinking: Conducts internal chain-of-thought reasoning entirely in Korean
  • Stable RL Training: Addresses reward hacking and policy collapse through oracle judge model for reward signal calibration
  • Enhanced Reasoning Performance: Substantially improved results on advanced reasoning benchmarks, particularly in math and coding tasks
  • Preserved Knowledge & Language Proficiency: Maintains existing knowledge and language capabilities after reinforcement learning
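For intuition, the group-relative advantage computation at the core of Dr. GRPO can be sketched as below. This is a minimal illustration of the standard Dr. GRPO formulation (group mean baseline, no per-group standard-deviation normalization); the oracle-guided reward calibration proposed in the paper is represented only by the already-calibrated reward values passed in, and `dr_grpo_advantages` is a hypothetical helper name, not an API from this repository.

```python
from statistics import mean

def dr_grpo_advantages(rewards):
    """Group-relative advantages in the Dr. GRPO style: subtract the
    group's mean reward from each sampled completion's reward. Unlike
    vanilla GRPO, Dr. GRPO does not divide by the group std, which
    avoids over-weighting prompts where rewards barely vary."""
    baseline = mean(rewards)
    return [r - baseline for r in rewards]

# Hypothetical (oracle-calibrated) rewards for 4 completions of one prompt:
rewards = [1.0, 0.0, 1.0, 0.0]
print(dr_grpo_advantages(rewards))  # → [0.5, -0.5, 0.5, -0.5]
```

Completions scoring above the group mean get positive advantages and are reinforced; those below are suppressed.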

Base Model

This model builds upon Smoothie Qwen3 (dnotitia/Smoothie-Qwen3-14B), a Qwen3 variant that reduces the probability of emitting Chinese tokens and thereby supports Korean-centric reasoning.
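Since the model emits its Korean chain-of-thought inside Qwen3-style `<think>...</think>` tags, a small helper can separate the reasoning from the final answer. The parsing function below is a sketch, and the commented loading code follows the usual transformers pattern for Qwen3-family models (the `enable_thinking` chat-template flag is a Qwen3 convention; none of this has been verified against this specific repository):

```python
import re

def split_think(text):
    """Split a Qwen3-style completion into (reasoning, answer) by
    extracting the <think>...</think> block, if present."""
    m = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if m:
        return m.group(1).strip(), text[m.end():].strip()
    return "", text.strip()

# Typical transformers loading/generation flow (illustrative, untested here):
#
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("dnotitia/DNA-2.1-14B")
#   model = AutoModelForCausalLM.from_pretrained(
#       "dnotitia/DNA-2.1-14B", torch_dtype="auto", device_map="auto")
#   prompt = tok.apply_chat_template(
#       [{"role": "user", "content": "..."}],
#       tokenize=False, add_generation_prompt=True,
#       enable_thinking=True)  # Qwen3 chat templates expose this flag
#   ids = tok(prompt, return_tensors="pt").to(model.device)
#   out = tok.decode(model.generate(**ids, max_new_tokens=1024)[0],
#                    skip_special_tokens=True)

demo = "<think>한국어로 생각합니다.</think>정답은 42입니다."
print(split_think(demo))  # → ('한국어로 생각합니다.', '정답은 42입니다.')
```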

Citation

If you use this model in your research, please cite our paper:

@misc{lee2025makingqwen3thinkkorean,
      title={Making Qwen3 Think in Korean with Reinforcement Learning}, 
      author={Jungyup Lee and Jemin Kim and Sang Park and SeungJae Lee},
      year={2025},
      eprint={2508.10355},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2508.10355}, 
}