ModelHub XC 9e3ca5e370 初始化项目,由ModelHub XC社区提供模型
Model: yamaTK/dpo-qwen-y-v35
Source: Original Platform
2026-05-22 04:44:18 +08:00

base_model, datasets, language, license, library_name, pipeline_tag, tags
base_model datasets language license library_name pipeline_tag tags
Qwen/Qwen3-4B-Instruct-2507
u-10bei/dpo-dataset-qwen-cot
en
apache-2.0 transformers text-generation
dpo
unsloth
qwen
alignment

dpo-qwen-y-v35

DPO fine-tuned version of Qwen/Qwen3-4B-Instruct-2507. Full-merged 16-bit weights. No adapter loading required.

Training Configuration

  • Method: DPO
  • Epochs: 1
  • Learning rate: 1e-07
  • Beta: 0.1
  • Max sequence length: 1024
Description
Model synced from source: yamaTK/dpo-qwen-y-v35
Readme 2 MiB
Languages
Jinja 100%