Model: CL-From-Nothing/Qwen3-4B-SSD-RLVE-Eval20-N20-global-step-500 Source: Original Platform
license, language, library_name, pipeline_tag, base_model, tags
| license | language | library_name | pipeline_tag | base_model | tags | |||||
|---|---|---|---|---|---|---|---|---|---|---|
| mit |
|
transformers | text-generation | Qwen/Qwen3-4B |
|
Qwen3-4B SSD (RLVE Eval20, N=20) — global step 500
Weights merged from VERL FSDP SFT checkpoint global_step_500 (500 optimizer steps, 1 epoch schedule) of
Simple Self-Distillation (SSD) applied to Qwen/Qwen3-4B:
sample N=20 self-generated responses from the frozen base model, then SFT on those samples.
Training data
Parquet SFT corpus (16k rows, messages column):
CL-From-Nothing/RLVE-Eval20-Qwen3-4B-SSD-N20-SFT-Train.
Companion 1.7B model: CL-From-Nothing/Qwen3-1-7B-SSD-RLVE-Eval20-N20-global-step-500.
Load
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "CL-From-Nothing/Qwen3-4B-SSD-RLVE-Eval20-N20-global-step-500"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
Description
Languages
Jinja
100%