---
license: mit
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- qwen3
- ssd
- self-distillation
- rlve
---

# Qwen3-1.7B SSD (RLVE Eval20, N=20) — global step 500

Weights merged from the VERL FSDP SFT checkpoint **`global_step_500`** (500 optimizer steps, 1-epoch schedule).

## Training data

Parquet SFT corpus (16k rows, `messages` column): [CL-From-Nothing/RLVE-Eval20-Qwen3-1.7B-SSD-N20-SFT-Train](https://huggingface.co/datasets/CL-From-Nothing/RLVE-Eval20-Qwen3-1.7B-SSD-N20-SFT-Train).

## Load

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CL-From-Nothing/Qwen3-1-7B-SSD-RLVE-Eval20-N20-global-step-500"
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="bfloat16",
    device_map="auto",
    trust_remote_code=True,
)
```

**Note:** Qwen3 requires `trust_remote_code=True`.
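## Generate

A minimal generation sketch, assuming the model and tokenizer from the snippet above are already loaded as `model` and `tok`; the prompt text is purely illustrative.

```python
# Build a chat-formatted prompt with the model's own chat template,
# then generate and decode only the newly produced tokens.
# Assumes `model` and `tok` from the Load section above.
messages = [{"role": "user", "content": "Briefly explain self-distillation."}]
inputs = tok.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
out = model.generate(inputs, max_new_tokens=256)
# Strip the prompt tokens before decoding so only the reply is printed.
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Greedy decoding is the `generate` default; sampling parameters (e.g. `do_sample=True`, `temperature`) can be passed if more diverse outputs are wanted.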