Model: CL-From-Nothing/Qwen3-1-7B-SSD-RLVE-Eval20-N20-global-step-500 Source: Original Platform
33 lines
932 B
Markdown
33 lines
932 B
Markdown
---
|
|
license: mit
|
|
language:
|
|
- en
|
|
library_name: transformers
|
|
pipeline_tag: text-generation
|
|
tags:
|
|
- qwen3
|
|
- ssd
|
|
- self-distillation
|
|
- rlve
|
|
---
|
|
|
|
# Qwen3-1.7B SSD (RLVE Eval20, N=20) — global step 500
|
|
|
|
Weights merged from VERL FSDP SFT checkpoint **`global_step_500`** (500 optimizer steps, 1 epoch schedule).
|
|
|
|
## Training data
|
|
|
|
Parquet SFT corpus (16k rows, `messages` column): [CL-From-Nothing/RLVE-Eval20-Qwen3-1.7B-SSD-N20-SFT-Train](https://huggingface.co/datasets/CL-From-Nothing/RLVE-Eval20-Qwen3-1.7B-SSD-N20-SFT-Train).
|
|
|
|
## Load
|
|
|
|
```python
|
|
from transformers import AutoModelForCausalLM, AutoTokenizer
|
|
|
|
model_id = "CL-From-Nothing/Qwen3-1-7B-SSD-RLVE-Eval20-N20-global-step-500"
|
|
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
|
|
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16", device_map="auto", trust_remote_code=True)
|
|
```
|
|
|
|
**Note:** Qwen3 requires `trust_remote_code=True`.
|