TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T


ModelHub XC f447788bb8 Initialize project; model provided by the ModelHub XC community

Model: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
Source: Original Platform

2026-05-17 21:49:35 +08:00

.gitattributes
config.json
generation_config.json
model.safetensors
pytorch_model.bin
README.md
special_tokens_map.json
tokenizer_config.json
tokenizer.json
tokenizer.model

README.md

language, license, datasets, model-index

license	apache-2.0
datasets	cerebras/SlimPajama-627B, bigcode/starcoderdata
model-index name	TinyLlama-1.1B-intermediate-step-1431k-3T
task	text-generation (Text Generation)

Benchmark	Dataset	Config	Split	num_few_shot	Metric	Value
AI2 Reasoning Challenge (25-Shot)	ai2_arc	ARC-Challenge	test	25	acc_norm (normalized accuracy)	33.87
HellaSwag (10-Shot)	hellaswag	-	validation	10	acc_norm (normalized accuracy)	60.31
MMLU (5-Shot)	cais/mmlu	all	test	5	acc (accuracy)	26.04
TruthfulQA (0-shot)	truthful_qa	multiple_choice	validation	0	mc2	37.32
Winogrande (5-shot)	winogrande	winogrande_xl	validation	5	acc (accuracy)	59.51
GSM8k (5-shot)	gsm8k	main	test	5	acc (accuracy)	1.44

Source for all results: Open LLM Leaderboard, https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T

TinyLlama-1.1B

https://github.com/jzhang38/TinyLlama

The TinyLlama project aims to pretrain a 1.1B Llama model on 3 trillion tokens. With proper optimization, we can achieve this in "just" 90 days using 16 A100-40G GPUs 🚀🚀. Training started on 2023-09-01.
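To put the 90-day target in perspective, here is a rough back-of-the-envelope calculation of the throughput it implies. This is only a sketch: it assumes constant throughput with no downtime, treats "3 trillion" as exactly 3e12 tokens, and takes the step count (1431k) from the name of the final checkpoint.

```python
# Rough throughput implied by the training plan (assumes no downtime).
TOTAL_TOKENS = 3e12      # 3 trillion tokens, treated as an exact figure
DAYS = 90
NUM_GPUS = 16
FINAL_STEP = 1_431_000   # from the checkpoint name "step-1431k-3T"

seconds = DAYS * 24 * 60 * 60
cluster_rate = TOTAL_TOKENS / seconds      # tokens per second, all GPUs combined
per_gpu_rate = cluster_rate / NUM_GPUS     # tokens per second per GPU
tokens_per_step = TOTAL_TOKENS / FINAL_STEP

print(f"Training time: {seconds:,} s")
print(f"Cluster throughput: {cluster_rate:,.0f} tokens/s")
print(f"Per-GPU throughput: {per_gpu_rate:,.0f} tokens/s")
print(f"Tokens per optimizer step: {tokens_per_step:,.0f}")
```

Under these assumptions each GPU has to sustain roughly 24k tokens per second, and each step consumes about 2.1 million tokens.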

We adopted exactly the same architecture and tokenizer as Llama 2, so TinyLlama can be dropped into many open-source projects built on Llama. TinyLlama is also compact, with only 1.1B parameters, which makes it suitable for applications with a restricted computation and memory footprint.
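Because the architecture and tokenizer match Llama 2, the checkpoint should load through the generic Hugging Face `transformers` auto classes. The following is a minimal sketch, not an official usage recipe: it assumes `transformers` and `torch` are installed, and the `generate` helper and its defaults are illustrative names rather than part of the project.

```python
MODEL_ID = "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"


def generate(prompt: str, max_new_tokens: int = 32) -> str:
    """Continue `prompt` with the base model (hypothetical helper)."""
    # Imported lazily so that defining the helper does not require the library.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Tokenize the prompt into PyTorch tensors and run generation.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the weights on first use):
# print(generate("The capital of France is"))
```

This is a pretrained base model, so it continues text rather than following chat-style instructions.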

This Collection

This collection contains all checkpoints after the 1T fix. Each branch name indicates the training step and the number of tokens seen.
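As an illustration of that naming scheme, the sketch below pulls the step and token count out of a checkpoint name. It assumes names end in `step-<N>k-<tokens>`, as the checkpoint names in the Eval table do; the actual branch names may differ, and `parse_checkpoint` is a hypothetical helper.

```python
import re
from typing import NamedTuple


class Checkpoint(NamedTuple):
    step: int     # optimizer step, e.g. 1_431_000
    tokens: str   # tokens seen, as written in the name, e.g. "3T"


# Assumed pattern: "...step-<N>k-<tokens>", where tokens ends in B or T.
_PATTERN = re.compile(r"step-(\d+)[kK]-(\d+(?:\.\d+)?[bBtT])$")


def parse_checkpoint(name: str) -> Checkpoint:
    """Extract the step and token count from a checkpoint name."""
    match = _PATTERN.search(name)
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {name!r}")
    return Checkpoint(step=int(match.group(1)) * 1000, tokens=match.group(2).upper())


print(parse_checkpoint("TinyLlama-1.1B-intermediate-step-1431k-3T"))
print(parse_checkpoint("TinyLlama-1.1B-intermediate-step-715k-1.5T"))
```

A specific branch can be selected when loading by passing its name as the `revision` argument of `from_pretrained` in `transformers`.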

Eval

Model	Pretrain Tokens	HellaSwag	OpenBookQA	WinoGrande	ARC_c	ARC_e	BoolQ	PIQA	Avg
Pythia-1.0B	300B	47.16	31.40	53.43	27.05	48.99	60.83	69.21	48.30
TinyLlama-1.1B-intermediate-step-50K-104b	103B	43.50	29.80	53.28	24.32	44.91	59.66	67.30	46.11
TinyLlama-1.1B-intermediate-step-240k-503b	503B	49.56	31.40	55.80	26.54	48.32	56.91	69.42	48.28
TinyLlama-1.1B-intermediate-step-480k-1007B	1007B	52.54	33.40	55.96	27.82	52.36	59.54	69.91	50.22
TinyLlama-1.1B-intermediate-step-715k-1.5T	1.5T	53.68	35.20	58.33	29.18	51.89	59.08	71.65	51.29
TinyLlama-1.1B-intermediate-step-955k-2T	2T	54.63	33.40	56.83	28.07	54.67	63.21	70.67	51.64
TinyLlama-1.1B-intermediate-step-1195k-2.5T	2.5T	58.96	34.40	58.72	31.91	56.78	63.21	73.07	53.86
TinyLlama-1.1B-intermediate-step-1431k-3T	3T	59.20	36.00	59.12	30.12	55.25	57.83	73.29	52.99

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	36.42
AI2 Reasoning Challenge (25-Shot)	33.87
HellaSwag (10-Shot)	60.31
MMLU (5-Shot)	26.04
TruthfulQA (0-shot)	37.32
Winogrande (5-shot)	59.51
GSM8k (5-shot)	1.44
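The Avg. row appears to be the unweighted mean of the six benchmark scores. A quick check, assuming that is how the leaderboard aggregates them:

```python
from statistics import mean

# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 33.87,
    "HellaSwag (10-Shot)": 60.31,
    "MMLU (5-Shot)": 26.04,
    "TruthfulQA (0-shot)": 37.32,
    "Winogrande (5-shot)": 59.51,
    "GSM8k (5-shot)": 1.44,
}

average = mean(scores.values())
print(f"Avg. {average:.2f}")
```

The exact mean is 36.415, which agrees with the reported 36.42 after rounding to two decimals.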