Model: Raghav-Singhal/epe-1p-smollm-1p7b-100B-20n-2048sl-960gbsz-no_bce-refl_end_doc Source: Original Platform
library_name, pipeline_tag, tags
| library_name | pipeline_tag | tags | |||
|---|---|---|---|---|---|
| transformers | text-generation |
|
epe-1p-smollm-1p7b-100B-20n-2048sl-960gbsz-no_bce-refl_end_doc
Converted Hugging Face base checkpoint from the Model Raising pretraining run.
Details
- Architecture:
LlamaForCausalLM - Base model size:
1.7B - Precision on disk:
bfloat16 - Source Megatron checkpoint iteration:
50863 - Model kind:
epe - Config vocab size:
49280
Tokenizer
Use the bundled tokenizer from this repository.
This EPE checkpoint uses the extended SmolLM2 tokenizer with <assistant> and 35 <charter_X.Y> tokens. Two named chat templates are available:
| Name | Assistant turn start |
|---|---|
default |
`< |
epe |
`< |
Description
Model synced from source: Raghav-Singhal/epe-1p-smollm-1p7b-100B-20n-2048sl-960gbsz-no_bce-refl_end_doc
Languages
Text
100%