---
library_name: transformers
pipeline_tag: text-generation
tags:
- llama
- causal-lm
- bfloat16
---

# epe-1p-smollm-1p7b-100B-20n-2048sl-960gbsz

Converted Hugging Face base checkpoint from the Model Raising EPE (Ethics-by-Pretraining) pretraining run, synced from `Raghav-Singhal/epe-1p-smollm-1p7b-100B-20n-2048sl-960gbsz`.

## Details

- Architecture: LlamaForCausalLM
- Base model size: 1.7B
- Precision on disk: bfloat16
- Tokenizer: extended SmolLM2 tokenizer with 36 additional special tokens (`<assistant>` + 35 `<charter_X.Y>` tokens), vocab size 49280
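
To confirm the extended tokenizer is in use, check the vocabulary size and the `<assistant>` token id (the repo path below is the sync source for this model):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Raghav-Singhal/epe-1p-smollm-1p7b-100B-20n-2048sl-960gbsz")
print(len(tok))                                  # expected: 49280
print(tok.convert_tokens_to_ids("<assistant>"))  # expected: 49152
```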

## EPE Pretraining

This model was pretrained with on-the-fly reflection insertion using the `reflection_1p` column from the annotated sidecar dataset. Training augments standard autoregressive next-token prediction (NTP) with the following components (minimal sketches follow the list):

  1. Reflection insertion: reflections are inserted into annotated documents on the fly during training; the model predicts the reflection tokens with a cross-entropy (CE) loss
  2. Constitution predictor: a multi-label BCE loss at the `<assistant>` token position trains the model to predict the 35 charter items
  3. Attention masking: post-reflection context tokens are blocked from attending to the reflection region
  4. Position aliasing: post-reflection context tokens alias back to the same RoPE positions they would occupy without the reflection, making inference (without reflections) positionally equivalent to training
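
The constitution predictor (item 2) can be pictured as a small classification head read off at the `<assistant>` position. This is a minimal sketch, not the run's training code: the head architecture, names, and label plumbing are assumptions; only the multi-label BCE objective at the `<assistant>` token over 35 charter items is taken from the description above.

```python
import torch.nn as nn
import torch.nn.functional as F

HIDDEN_SIZE = 2048    # SmolLM2-1.7B hidden size
NUM_CHARTER_ITEMS = 35
ASSISTANT_ID = 49152  # <assistant> token id in the extended tokenizer

# Hypothetical linear head; the run's actual predictor may differ.
constitution_head = nn.Linear(HIDDEN_SIZE, NUM_CHARTER_ITEMS)

def constitution_loss(hidden_states, input_ids, charter_labels):
    """Multi-label BCE over the charter items at each <assistant> position.

    hidden_states:  (batch, seq, hidden) transformer outputs
    input_ids:      (batch, seq) token ids
    charter_labels: (n_positions, 35) binary targets, one row per
                    <assistant> occurrence in the batch
    """
    b_idx, t_idx = (input_ids == ASSISTANT_ID).nonzero(as_tuple=True)
    logits = constitution_head(hidden_states[b_idx, t_idx])  # (n, 35)
    return F.binary_cross_entropy_with_logits(logits, charter_labels.float())
```

Items 3 and 4 interact: blocking attention into the reflection while aliasing post-reflection positions is what makes a reflection-free inference pass look identical to training for the post-reflection tokens. A minimal sketch, assuming a single reflection span per document and a boolean mask where `True` means "may attend" (`epe_mask_and_positions` and its arguments are illustrative names, not the training code's API):

```python
import torch

def epe_mask_and_positions(seq_len, refl_start, refl_len):
    """Attention mask and RoPE position ids for [pre | reflection | post]."""
    refl_end = refl_start + refl_len

    # Standard causal mask, then block post-reflection queries from
    # attending to keys inside the reflection span.
    mask = torch.tril(torch.ones(seq_len, seq_len)).bool()
    mask[refl_end:, refl_start:refl_end] = False

    # Causal positions, with post-reflection tokens shifted back by the
    # reflection length so they sit where they would without the reflection.
    pos = torch.arange(seq_len)
    pos[refl_end:] -= refl_len
    return mask, pos

mask, pos = epe_mask_and_positions(seq_len=10, refl_start=4, refl_len=3)
# pos == [0, 1, 2, 3, 4, 5, 6, 4, 5, 6]: the reflection occupies RoPE
# positions 4-6, and the post-reflection context reuses those positions.
```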

## Chat Templates

Two named chat templates are provided:

| Name | Use case |
|------|----------|
| `default` | Standard SFT; plain assistant role token |
| `epe` | Activates the constitution head; uses `<assistant>` (token 49152) at the start of assistant turns |
tok.apply_chat_template(messages, chat_template="default")  # standard
tok.apply_chat_template(messages, chat_template="epe")      # constitution head active

## SFT Notes

- Always use the bundled tokenizer (vocab size 49280); the original SmolLM2 tokenizer (49152 tokens) will mismatch embeddings for IDs 49152–49279 (a quick check is sketched below)
- `vocab_size=49280` is set in `config.json`
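
As a pre-SFT sanity check (a minimal sketch, assuming the checkpoint is loaded from the sync source; it only verifies that tokenizer and embedding sizes agree):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "Raghav-Singhal/epe-1p-smollm-1p7b-100B-20n-2048sl-960gbsz"
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.bfloat16)

# The bundled tokenizer and the checkpoint embeddings must agree over the
# extended ID range 49152-49279; a stock SmolLM2 tokenizer will not.
assert len(tok) == model.config.vocab_size == 49280
```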