ModelHub XC 0377caf6f0 Initialize project; model provided by the ModelHub XC community
Model: Raghav-Singhal/epe-1p-smollm-1p7b-100B-20n-2048sl-960gbsz-no_bce-refl_end_doc
Source: Original Platform
2026-05-26 14:23:18 +08:00


---
library_name: transformers
pipeline_tag: text-generation
tags:
- llama
- causal-lm
- bfloat16
---
# epe-1p-smollm-1p7b-100B-20n-2048sl-960gbsz-no_bce-refl_end_doc
Base checkpoint from the Model Raising pretraining run, converted to Hugging Face format.
## Details
- Architecture: `LlamaForCausalLM`
- Base model size: `1.7B`
- Precision on disk: `bfloat16`
- Source Megatron checkpoint iteration: `50863`
- Model kind: `epe`
- Config vocab size: `49280`
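The values above can be checked against the published config before downloading the weights. The following is a minimal sketch, not an official loading recipe: the repository ID is taken from the model name and may differ on a mirror, and `check_details` and `load_and_check` are hypothetical helper names introduced only for illustration.

```python
# Assumption: the checkpoint is reachable under this repository ID.
REPO_ID = (
    "Raghav-Singhal/"
    "epe-1p-smollm-1p7b-100B-20n-2048sl-960gbsz-no_bce-refl_end_doc"
)

# Values copied from the Details list of this README.
EXPECTED = {
    "architecture": "LlamaForCausalLM",
    "vocab_size": 49280,
    "dtype": "bfloat16",
}


def check_details(architectures, vocab_size, dtype_name):
    """Return a list of mismatches between a config and the README values."""
    problems = []
    if EXPECTED["architecture"] not in (architectures or []):
        problems.append(f"unexpected architectures: {architectures}")
    if vocab_size != EXPECTED["vocab_size"]:
        problems.append(f"unexpected vocab size: {vocab_size}")
    if dtype_name != EXPECTED["dtype"]:
        problems.append(f"unexpected dtype: {dtype_name}")
    return problems


def load_and_check(repo_id=REPO_ID):
    """Hypothetical helper: fetch the config, verify it, then load the model."""
    # Imports are local so the pure-Python check above has no heavy dependencies.
    import torch
    from transformers import AutoConfig, AutoModelForCausalLM

    config = AutoConfig.from_pretrained(repo_id)
    # The dtype may be exposed as a torch.dtype or as a plain string,
    # depending on the transformers version; normalise both to a bare name.
    dtype_name = str(getattr(config, "torch_dtype", None)).replace("torch.", "")
    problems = check_details(config.architectures, config.vocab_size, dtype_name)
    if problems:
        raise RuntimeError("; ".join(problems))

    # Request bfloat16 explicitly so the weights keep their on-disk precision.
    return AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
```

Calling `load_and_check()` downloads the full weights, so it needs network access and enough memory for a 1.7B-parameter model in `bfloat16`.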
## Tokenizer
Use the bundled tokenizer from this repository.
This EPE checkpoint uses the extended SmolLM2 tokenizer with `<assistant>` and 35 `<charter_X.Y>` tokens. Two named chat templates are available:
| Name | Assistant turn start |
|------|----------------------|
| `default` | `<\|im_start\|>assistant\n` |
| `epe` | `<\|im_start\|><assistant>\n` |
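A named template can be selected through the `chat_template` argument of `apply_chat_template`, and the added tokens can be counted from the tokenizer vocabulary. The sketch below assumes that the rendered prompt ends with the assistant turn start from the table when `add_generation_prompt=True`; `summarize_added_tokens`, `render_prompt`, and `load_tokenizer` are hypothetical helper names, not part of this repository.

```python
# Assistant turn starts as listed in the table above (\n is a real newline).
ASSISTANT_TURN_START = {
    "default": "<|im_start|>assistant\n",
    "epe": "<|im_start|><assistant>\n",
}


def summarize_added_tokens(vocab):
    """Report whether <assistant> exists and how many <charter_X.Y> tokens do."""
    charter = [t for t in vocab if t.startswith("<charter_") and t.endswith(">")]
    return {"has_assistant": "<assistant>" in vocab, "charter_count": len(charter)}


def render_prompt(tokenizer, messages, template_name="epe"):
    """Render messages to text using one of the two named chat templates."""
    if template_name not in ASSISTANT_TURN_START:
        raise ValueError(f"unknown chat template: {template_name!r}")
    return tokenizer.apply_chat_template(
        messages,
        chat_template=template_name,  # looked up by name among the bundled templates
        tokenize=False,               # return a string rather than token IDs
        add_generation_prompt=True,   # append the assistant turn start
    )


def load_tokenizer(repo_id):
    """Load the bundled tokenizer; repo_id is whatever path or ID hosts this repo."""
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(repo_id)
```

With a loaded tokenizer `tok`, `summarize_added_tokens(tok.get_vocab())` should report `has_assistant` as true and a `charter_count` of 35 if the vocabulary matches the description above, and `render_prompt(tok, [{"role": "user", "content": "Hello"}], "epe")` can be compared against `ASSISTANT_TURN_START["epe"]` with `str.endswith`.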