初始化项目,由ModelHub XC社区提供模型
Model: CL-From-Nothing/Qwen3-4B-SSD-RLVE-Eval20-N20-global-step-500 Source: Original Platform
This commit is contained in:
36
README.md
Normal file
36
README.md
Normal file
@@ -0,0 +1,36 @@
|
||||
---
|
||||
license: mit
|
||||
language:
|
||||
- en
|
||||
library_name: transformers
|
||||
pipeline_tag: text-generation
|
||||
base_model: Qwen/Qwen3-4B
|
||||
tags:
|
||||
- qwen3
|
||||
- ssd
|
||||
- self-distillation
|
||||
- rlve
|
||||
---
|
||||
|
||||
# Qwen3-4B SSD (RLVE Eval20, N=20) — global step 500
|
||||
|
||||
Weights merged from VERL FSDP SFT checkpoint **`global_step_500`** (500 optimizer steps, 1 epoch schedule) of
|
||||
**Simple Self-Distillation (SSD)** applied to **Qwen/Qwen3-4B**:
|
||||
sample N=20 self-generated responses from the frozen base model, then SFT on those samples.
|
||||
|
||||
## Training data
|
||||
|
||||
Parquet SFT corpus (16k rows, `messages` column):
|
||||
[CL-From-Nothing/RLVE-Eval20-Qwen3-4B-SSD-N20-SFT-Train](https://huggingface.co/datasets/CL-From-Nothing/RLVE-Eval20-Qwen3-4B-SSD-N20-SFT-Train).
|
||||
|
||||
Companion 1.7B model: [CL-From-Nothing/Qwen3-1-7B-SSD-RLVE-Eval20-N20-global-step-500](https://huggingface.co/CL-From-Nothing/Qwen3-1-7B-SSD-RLVE-Eval20-N20-global-step-500).
|
||||
|
||||
## Load
|
||||
|
||||
```python
|
||||
from transformers import AutoModelForCausalLM, AutoTokenizer
|
||||
|
||||
model_id = "CL-From-Nothing/Qwen3-4B-SSD-RLVE-Eval20-N20-global-step-500"
|
||||
tok = AutoTokenizer.from_pretrained(model_id)
|
||||
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
|
||||
```
|
||||
Reference in New Issue
Block a user