base_model, tags, license, language, datasets, pipeline_tag, library_name
base_model tags license language datasets pipeline_tag library_name
unsloth/Qwen2.5-3B-Instruct-unsloth-bnb-4bit
unsloth
qwen2
trl
grpo
apache-2.0
en
openai/gsm8k
text-generation transformers

Qwen2.5-3B-Reasoning-gsm8k-v1

  • Developed by: nomadicsynth
  • License: apache-2.0
  • Finetuned from model: unsloth/Qwen2.5-3B-Instruct-unsloth-bnb-4bit
  • Training Notebook: Qwen2.5_(3B)-GRPO.ipynb

This qwen2 model was trained 2x faster with Unsloth and Huggingface's TRL library.

Description
Model synced from source: nomadicsynth/Qwen2.5-3B-Instruct-Reasoning-gsm8k-v1
Readme 2 MiB