---
library_name: transformers
pipeline_tag: text-generation
tags:
- llama
- causal-lm
- bfloat16
---

# epe-3p-smollm-1p7b-100B-20n-2048sl-960gbsz

Converted Hugging Face base checkpoint from the Model Raising EPE (Ethics-by-Pretraining) pretraining run.

## Details

- Architecture: `LlamaForCausalLM`
- Base model size: `1.7B`
- Precision on disk: `bfloat16`
- Tokenizer: extended SmolLM2 tokenizer with 36 additional special tokens (`` + 35 `` tokens), vocab size 49280

## EPE Pretraining

This model was pretrained with on-the-fly reflection insertion using the `reflection_3p` column from the annotated sidecar dataset. The training augments standard autoregressive next-token prediction (NTP) with:

1. **Reflection insertion**: reflections are inserted into annotated documents on the fly during training; the model predicts the reflection tokens with a CE loss
2. **Constitution predictor**: a multi-label BCE loss at the `` token position trains the model to predict 35 charter items
3. **Attention masking**: post-reflection context tokens are blocked from attending to the reflection region
4. **Position aliasing**: post-reflection context tokens alias back to the same RoPE positions as pre-reflection context, making inference (without reflections) positionally equivalent to training (a conceptual sketch of the masking and aliasing appears at the end of this card)

## Chat Templates

Two named chat templates are provided:

| Name | Use case |
|------|----------|
| `default` | Standard SFT — plain `assistant` role token |
| `epe` | Activates the constitution head — uses `` (token 49152) at the start of assistant turns |

```python
tok.apply_chat_template(messages, chat_template="default")  # standard
tok.apply_chat_template(messages, chat_template="epe")      # constitution head active
```

## SFT Notes

- Always use the bundled tokenizer (vocab size 49280); the original SmolLM2 tokenizer (49152 tokens) will mismatch embeddings for IDs 49152–49279
- `vocab_size=49280` is set in `config.json`
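
As a quick illustration of the SFT notes above, here is a minimal loading sketch. The repo id is a placeholder for wherever this checkpoint is hosted; everything else uses standard `transformers` APIs.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id; point this at the actual location of this checkpoint.
repo_id = "epe-3p-smollm-1p7b-100B-20n-2048sl-960gbsz"

# Load the bundled tokenizer together with the bfloat16 weights so that the
# extended vocabulary (IDs 49152-49279) lines up with the embedding matrix.
tok = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)

assert model.config.vocab_size == 49280  # as set in config.json

# Standard generation with the `default` template; switch to
# chat_template="epe" to activate the constitution head instead.
messages = [{"role": "user", "content": "Hello!"}]
inputs = tok.apply_chat_template(
    messages, chat_template="default", add_generation_prompt=True, return_tensors="pt"
)
out = model.generate(inputs, max_new_tokens=64)
print(tok.decode(out[0], skip_special_tokens=True))
```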
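
For readers curious how points 3 and 4 of the EPE pretraining recipe fit together, the following is a conceptual sketch (not the actual training code) of how reflection attention masking and RoPE position aliasing can be built for a document split into pre-context, reflection, and post-context spans. The function name and toy lengths are illustrative only.

```python
import torch

def build_epe_inputs(pre_len: int, refl_len: int, post_len: int):
    """Sketch of EPE position aliasing and reflection attention masking for one
    document split as [pre-context][reflection][post-context]."""
    total = pre_len + refl_len + post_len

    # RoPE positions: reflection tokens get their own positions, but
    # post-context tokens alias back to the positions they would have had
    # if the reflection were absent (i.e. they continue from the pre-context).
    pre_pos = torch.arange(pre_len)
    refl_pos = torch.arange(pre_len, pre_len + refl_len)
    post_pos = torch.arange(pre_len, pre_len + post_len)
    position_ids = torch.cat([pre_pos, refl_pos, post_pos])

    # Start from a standard causal mask (True = query may attend to key).
    mask = torch.tril(torch.ones(total, total, dtype=torch.bool))

    # Block post-context queries from attending to the reflection region.
    refl_slice = slice(pre_len, pre_len + refl_len)
    post_slice = slice(pre_len + refl_len, total)
    mask[post_slice, refl_slice] = False

    return position_ids, mask

pos, mask = build_epe_inputs(pre_len=4, refl_len=3, post_len=5)
print(pos)  # tensor([0, 1, 2, 3, 4, 5, 6, 4, 5, 6, 7, 8])
```

Because post-context positions repeat the values they would have had without the reflection, dropping the reflection at inference time leaves the positional geometry unchanged, which is the point of item 4 above.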