---
library_name: transformers
pipeline_tag: text-generation
tags:
- verl
- math
- grpo
- transfer
- llama
- llama-3.1
- llama-3.2
- 8b
- 3b
- pool7to1
---

# VERL Math Transfer Llama 3.1 8B to Llama 3.2 3B pool7to1

This repository collects every Hugging Face checkpoint exported from a math transfer experiment trained with verl, using the Llama 3.1 8B to Llama 3.2 3B `pool7to1` configuration.

## Layout

- `main`: latest exported checkpoint, currently `step-080`
- step revisions: `step-010`, `step-020`, `step-030`, `step-040`, `step-050`, `step-060`, `step-070`, `step-080`
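
The revision names can also be discovered programmatically instead of being read from this list. The following is a minimal sketch, assuming the `huggingface_hub` package is installed and that the step revisions are published as branches or tags (this README does not say which). `pick_step_revisions` and `list_step_revisions` are hypothetical helper names, not part of any library.

```python
import re

# Matches revision names of the form used in this repo, e.g. "step-080".
STEP_PATTERN = re.compile(r"^step-(\d+)$")


def pick_step_revisions(ref_names):
    """Keep only names that look like `step-NNN`, sorted by step number."""
    steps = [name for name in ref_names if STEP_PATTERN.match(name)]
    return sorted(steps, key=lambda name: int(STEP_PATTERN.match(name).group(1)))


def list_step_revisions(repo_id):
    """Query the Hub for branches and tags, then keep the step revisions."""
    # Imported lazily so the pure helper above works without the package.
    from huggingface_hub import list_repo_refs

    refs = list_repo_refs(repo_id)
    # Assumption: step revisions may be either branches or tags, so check both.
    names = [ref.name for ref in refs.branches] + [ref.name for ref in refs.tags]
    return pick_step_revisions(names)
```

Calling `list_step_revisions("hyunseoki/verl-math-transfer-llama31-8b-to-llama32-3b-pool7to1")` requires network access and should return the step names listed above if the layout is still current.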

## Usage

Load the latest checkpoint from `main`:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "hyunseoki/verl-math-transfer-llama31-8b-to-llama32-3b-pool7to1"
# LlamaForCausalLM ships with transformers, so trust_remote_code is not needed.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)
```

Load a specific checkpoint revision:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "hyunseoki/verl-math-transfer-llama31-8b-to-llama32-3b-pool7to1"
revision = "step-080"  # any step revision listed under Layout
# LlamaForCausalLM ships with transformers, so trust_remote_code is not needed.
tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=revision)
model = AutoModelForCausalLM.from_pretrained(repo_id, revision=revision)
```
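
Once a checkpoint is loaded, it can be used for generation. The following is a rough sketch, assuming `torch` and `transformers` are installed. The prompt format in `build_prompt` is a hypothetical placeholder: this README does not document the prompt template used during training, so adjust it to match your setup. Greedy decoding (`do_sample=False`) is likewise only an illustrative choice.

```python
def build_prompt(question):
    """Wrap a question in a plain-text prompt.

    Hypothetical format for illustration; not taken from the training setup.
    """
    return f"Question: {question.strip()}\nAnswer:"


def generate_answer(question, repo_id, revision="main", max_new_tokens=256):
    """Load one checkpoint revision and greedily decode a completion."""
    # Imported lazily so build_prompt can be used without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=revision)
    model = AutoModelForCausalLM.from_pretrained(repo_id, revision=revision)
    model.eval()

    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )
    # Drop the prompt tokens so only the newly generated text is decoded.
    completion_ids = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(completion_ids, skip_special_tokens=True)
```

For example, `generate_answer("What is 12 * 7?", repo_id, revision="step-080")` downloads that revision and returns the decoded completion as a string.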

## Notes

- Architecture detected from the exported config: `LlamaForCausalLM`
- The original base model Hub ID is not encoded in these local checkpoints, so `base_model` metadata is not set automatically.
- Checkpoints were exported from verl FSDP shards into Hugging Face `safetensors` format.
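
The architecture and weight format noted above can be checked for any revision without downloading the weights. The following is a minimal sketch, assuming `transformers` and `huggingface_hub` are installed. `summarize_export` and `inspect_revision` are hypothetical helper names introduced here for illustration.

```python
def summarize_export(architectures, filenames):
    """Report whether the config names LlamaForCausalLM and which weight files are safetensors."""
    return {
        "is_llama_causal_lm": "LlamaForCausalLM" in (architectures or []),
        "safetensors_files": sorted(
            name for name in filenames if name.endswith(".safetensors")
        ),
    }


def inspect_revision(repo_id, revision="main"):
    """Fetch the config and the file listing for one revision, then summarize them."""
    # Imported lazily so summarize_export works without these packages.
    from huggingface_hub import list_repo_files
    from transformers import AutoConfig

    config = AutoConfig.from_pretrained(repo_id, revision=revision)
    filenames = list_repo_files(repo_id, revision=revision)
    return summarize_export(config.architectures, filenames)
```

Only the config and the file listing are fetched, so this check is cheap to run across all step revisions.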