---
library_name: transformers
pipeline_tag: text-generation
tags:
- llama
- causal-lm
- bfloat16
---

# epe-1p-smollm-1p7b-100B-20n-2048sl-960gbsz

Converted Hugging Face base checkpoint from the Model Raising EPE (Ethics-by-Pretraining) pretraining run, synced from `Raghav-Singhal/epe-1p-smollm-1p7b-100B-20n-2048sl-960gbsz`.

## Details

- Architecture: LlamaForCausalLM
- Base model size: 1.7B
- Precision on disk: bfloat16
- Tokenizer: extended SmolLM2 tokenizer with 36 additional special tokens (`<assistant>` + 35 `<charter_X.Y>` tokens), vocab size 49280
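
To confirm the extended tokenizer is in use, check the vocabulary size and the `<assistant>` token id (the repo path below is the sync source for this model):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Raghav-Singhal/epe-1p-smollm-1p7b-100B-20n-2048sl-960gbsz")
print(len(tok))                                  # expected: 49280
print(tok.convert_tokens_to_ids("<assistant>"))  # expected: 49152
```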

## EPE Pretraining

This model was pretrained with on-the-fly reflection insertion using the `reflection_1p` column from the annotated sidecar dataset. Training augments standard autoregressive next-token prediction (NTP) with the following components (minimal sketches follow the list):

  1. Reflection insertion: reflections are inserted into annotated documents on the fly during training; the model predicts the reflection tokens with a cross-entropy (CE) loss
  2. Constitution predictor: a multi-label BCE loss at the `<assistant>` token position trains the model to predict the 35 charter items
  3. Attention masking: post-reflection context tokens are blocked from attending to the reflection region
  4. Position aliasing: post-reflection context tokens alias back to the same RoPE positions they would occupy without the reflection, making inference (without reflections) positionally equivalent to training
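
The constitution predictor (item 2) can be pictured as a small classification head read off at the `<assistant>` position. This is a minimal sketch, not the run's training code: the head architecture, names, and label plumbing are assumptions; only the multi-label BCE objective at the `<assistant>` token over 35 charter items is taken from the description above.

```python
import torch.nn as nn
import torch.nn.functional as F

HIDDEN_SIZE = 2048    # SmolLM2-1.7B hidden size
NUM_CHARTER_ITEMS = 35
ASSISTANT_ID = 49152  # <assistant> token id in the extended tokenizer

# Hypothetical linear head; the run's actual predictor may differ.
constitution_head = nn.Linear(HIDDEN_SIZE, NUM_CHARTER_ITEMS)

def constitution_loss(hidden_states, input_ids, charter_labels):
    """Multi-label BCE over the charter items at each <assistant> position.

    hidden_states:  (batch, seq, hidden) transformer outputs
    input_ids:      (batch, seq) token ids
    charter_labels: (n_positions, 35) binary targets, one row per
                    <assistant> occurrence in the batch
    """
    b_idx, t_idx = (input_ids == ASSISTANT_ID).nonzero(as_tuple=True)
    logits = constitution_head(hidden_states[b_idx, t_idx])  # (n, 35)
    return F.binary_cross_entropy_with_logits(logits, charter_labels.float())
```

Items 3 and 4 interact: blocking attention into the reflection while aliasing post-reflection positions is what makes a reflection-free inference pass look identical to training for the post-reflection tokens. A minimal sketch, assuming a single reflection span per document and a boolean mask where `True` means "may attend" (`epe_mask_and_positions` and its arguments are illustrative names, not the training code's API):

```python
import torch

def epe_mask_and_positions(seq_len, refl_start, refl_len):
    """Attention mask and RoPE position ids for [pre | reflection | post]."""
    refl_end = refl_start + refl_len

    # Standard causal mask, then block post-reflection queries from
    # attending to keys inside the reflection span.
    mask = torch.tril(torch.ones(seq_len, seq_len)).bool()
    mask[refl_end:, refl_start:refl_end] = False

    # Causal positions, with post-reflection tokens shifted back by the
    # reflection length so they sit where they would without the reflection.
    pos = torch.arange(seq_len)
    pos[refl_end:] -= refl_len
    return mask, pos

mask, pos = epe_mask_and_positions(seq_len=10, refl_start=4, refl_len=3)
# pos == [0, 1, 2, 3, 4, 5, 6, 4, 5, 6]: the reflection occupies RoPE
# positions 4-6, and the post-reflection context reuses those positions.
```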

## Chat Templates

Two named chat templates are provided:

| Name | Use case |
|------|----------|
| `default` | Standard SFT; plain assistant role token |
| `epe` | Activates the constitution head; uses `<assistant>` (token 49152) at the start of assistant turns |
tok.apply_chat_template(messages, chat_template="default")  # standard
tok.apply_chat_template(messages, chat_template="epe")      # constitution head active

## SFT Notes

- Always use the bundled tokenizer (vocab size 49280); the original SmolLM2 tokenizer (49152 tokens) will mismatch embeddings for IDs 49152–49279 (a quick check is sketched below)
- `vocab_size=49280` is set in `config.json`
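
As a pre-SFT sanity check (a minimal sketch, assuming the checkpoint is loaded from the sync source; it only verifies that tokenizer and embedding sizes agree):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "Raghav-Singhal/epe-1p-smollm-1p7b-100B-20n-2048sl-960gbsz"
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.bfloat16)

# The bundled tokenizer and the checkpoint embeddings must agree over the
# extended ID range 49152-49279; a stock SmolLM2 tokenizer will not.
assert len(tok) == model.config.vocab_size == 49280
```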