---
language: en
license: apache-2.0
tags:
- smollm
- llama
- causal-lm
- sft
- tulu
model_type: llama
pipeline_tag: text-generation
---
# tulu3sft-normal-smollm-1p7b-500B-30n-2048sl-960gbsz

This is a supervised fine-tuned (SFT) checkpoint of a SmolLM2-style 1.7B model,
trained on the `allenai/tulu-3-sft-mixture` dataset. It is based on the 500B-token
pretrained base checkpoint and exported in the Hugging Face `LlamaForCausalLM` format.
## Details

- Base model: `normal-smollm-1p7b-500B-30n-2048sl-960gbsz`
- SFT dataset: `allenai/tulu-3-sft-mixture`
- Context length: 2048 tokens
- Vocab size: 49152
- Architecture: Llama (RMSNorm, SwiGLU, RoPE)
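The shape parameters above can be mirrored in a `transformers` `LlamaConfig` as a quick sanity check before downloading weights. This is a sketch using only the values stated on this card; any field not listed here (layer count, hidden size, attention heads) is left at the library default, so it is not the exported config itself.

```python
from transformers import LlamaConfig

# Sketch of the card's stated shape. Fields not listed on this card
# (num_hidden_layers, hidden_size, etc.) keep LlamaConfig defaults.
config = LlamaConfig(
    vocab_size=49152,              # vocab size per this card
    max_position_embeddings=2048,  # context length per this card
)
print(config.model_type)  # "llama"
```

Comparing these values against `AutoConfig.from_pretrained(model_id)` after download is a cheap way to confirm you pulled the intended checkpoint.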
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "REPLACE_WITH_OWNER/tulu3sft-normal-smollm-1p7b-500B-30n-2048sl-960gbsz"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```
|
|
|
|
## Notes
|
|
|
|
This is an SFT model intended for chat-style use. For preference tuning, run DPO
|
|
on top of this checkpoint.
|
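For chat-style use, prompts should go through the tokenizer's chat template via `tokenizer.apply_chat_template(messages, add_generation_prompt=True)`. As an illustration only, the following hypothetical formatter sketches the `<|user|>` / `<|assistant|>` turn layout commonly used by Tulu-style templates; this exact format is an assumption, not read from this checkpoint, so prefer the template shipped with the tokenizer.

```python
# Hypothetical formatter illustrating a Tulu-style turn layout.
# The authoritative template ships with the tokenizer; prefer
# tokenizer.apply_chat_template(messages, add_generation_prompt=True).
def format_tulu_prompt(messages):
    parts = [f"<|{m['role']}|>\n{m['content']}\n" for m in messages]
    parts.append("<|assistant|>\n")  # cue the model to respond
    return "".join(parts)

prompt = format_tulu_prompt([{"role": "user", "content": "Hello!"}])
print(prompt)
```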