初始化项目,由ModelHub XC社区提供模型
Model: hyunseoki/verl-math-transfer-7bi-to-3bi-fix07-pool7to1 Source: Original Platform
This commit is contained in:
49
README.md
Normal file
49
README.md
Normal file
@@ -0,0 +1,49 @@
|
||||
---
|
||||
library_name: transformers
|
||||
pipeline_tag: text-generation
|
||||
tags:
|
||||
- verl
|
||||
- math
|
||||
- grpo
|
||||
- transfer
|
||||
- qwen2
|
||||
- 3b
|
||||
- 7bi-to-3bi
|
||||
- pool7to1
|
||||
---
|
||||
|
||||
# VERL Math Transfer 7B to 3B fix07 pool7to1
|
||||
|
||||
Math transfer experiment trained with verl. This repo groups all exported Hugging Face checkpoints for the 7B-to-3B fix_0_7 pool7to1 configuration.
|
||||
|
||||
## Layout
|
||||
|
||||
- `main`: latest exported checkpoint, currently `step-070`
|
||||
- step revisions: `step-010, step-020, step-030, step-040, step-050, step-060, step-070`
|
||||
|
||||
## Usage
|
||||
|
||||
```python
|
||||
from transformers import AutoTokenizer, AutoModelForCausalLM
|
||||
|
||||
repo_id = "hyunseoki/verl-math-transfer-7bi-to-3bi-fix07-pool7to1"
|
||||
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
|
||||
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
|
||||
```
|
||||
|
||||
Load a specific checkpoint revision:
|
||||
|
||||
```python
|
||||
from transformers import AutoTokenizer, AutoModelForCausalLM
|
||||
|
||||
repo_id = "hyunseoki/verl-math-transfer-7bi-to-3bi-fix07-pool7to1"
|
||||
revision = "step-070"
|
||||
tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=revision, trust_remote_code=True)
|
||||
model = AutoModelForCausalLM.from_pretrained(repo_id, revision=revision, trust_remote_code=True)
|
||||
```
|
||||
|
||||
## Notes
|
||||
|
||||
- Architecture detected from the exported config: `Qwen2ForCausalLM`
|
||||
- The original base model Hub ID is not encoded in these local checkpoints, so `base_model` metadata is not set automatically.
|
||||
- Checkpoints were exported from verl FSDP shards into Hugging Face `safetensors` format.
|
||||
Reference in New Issue
Block a user