Initialize project; model provided by the ModelHub XC community

Model: habanoz/tinyllama-oasst1-top1-instruct-full-lr1-5-v0.1
Source: Original Platform
ModelHub XC
2026-05-23 15:31:42 +08:00
commit 2cbb604314
16 changed files with 96124 additions and 0 deletions

35
.gitattributes vendored Normal file

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
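The patterns above decide which files are stored through Git LFS. As a rough illustration, the sketch below checks a few file names from this commit against a subset of those patterns. It uses Python's `fnmatchcase`, which only approximates gitattributes matching for simple `*.ext` patterns; the helper name `tracked_by_lfs` is hypothetical.

```python
from fnmatch import fnmatchcase

# A subset of the patterns listed in .gitattributes above.
LFS_PATTERNS = ["*.bin", "*.model", "*.pt", "*.pth", "*.safetensors"]


def tracked_by_lfs(filename: str) -> bool:
    """Return True if the file name matches any of the LFS patterns (approximation)."""
    return any(fnmatchcase(filename, pattern) for pattern in LFS_PATTERNS)


# File names taken from this commit.
for name in ["model.safetensors", "optimizer.pt", "rng_state.pth",
             "tokenizer.model", "config.json", "tokenizer.json"]:
    print(f"{name}: {'LFS' if tracked_by_lfs(name) else 'plain git'}")
```

This agrees with the listing: the weight, optimizer, and state files appear as LFS pointers, while the JSON files are stored as regular text.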

170
README.md Normal file

@@ -0,0 +1,170 @@
---
language:
- en
license: apache-2.0
datasets:
- OpenAssistant/oasst_top1_2023-08-25
pipeline_tag: text-generation
model-index:
- name: tinyllama-oasst1-top1-instruct-full-lr1-5-v0.1
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 32.85
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=habanoz/tinyllama-oasst1-top1-instruct-full-lr1-5-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 58.16
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=habanoz/tinyllama-oasst1-top1-instruct-full-lr1-5-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 25.96
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=habanoz/tinyllama-oasst1-top1-instruct-full-lr1-5-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 38.35
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=habanoz/tinyllama-oasst1-top1-instruct-full-lr1-5-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 57.7
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=habanoz/tinyllama-oasst1-top1-instruct-full-lr1-5-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 0.45
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=habanoz/tinyllama-oasst1-top1-instruct-full-lr1-5-v0.1
      name: Open LLM Leaderboard
---
TinyLlama/TinyLlama-1.1B-intermediate-step-715k-1.5T, fine-tuned on the OpenAssistant/oasst_top1_2023-08-25 dataset.

SFT code:
https://github.com/jzhang38/TinyLlama/tree/main/sft

Evaluation results:
https://huggingface.co/datasets/open-llm-leaderboard/details_habanoz__tinyllama-oasst1-top1-instruct-full-lr1-5-v0.1_public/blob/main/results_2023-11-23T17-25-53.937618.json

Command used:
```bash
accelerate launch finetune.py \
--model_name_or_path TinyLlama/TinyLlama-1.1B-intermediate-step-715k-1.5T \
--output_dir ./output/1_5T_FT_lr1e-5_ep5_top1_2023-08-25 \
--logging_steps 10 \
--save_strategy epoch \
--data_seed 42 \
--save_total_limit 2 \
--evaluation_strategy epoch \
--eval_dataset_size 512 \
--max_eval_samples 1000 \
--per_device_eval_batch_size 1 \
--max_new_tokens 32 \
--dataloader_num_workers 3 \
--group_by_length=False \
--logging_strategy steps \
--remove_unused_columns False \
--do_train \
--do_eval \
--warmup_ratio 0.05 \
--lr_scheduler_type constant \
--dataset OpenAssistant/oasst_top1_2023-08-25 \
--dataset_format oasst1 \
--source_max_len 1 \
--target_max_len 1023 \
--per_device_train_batch_size 2 \
--gradient_accumulation_steps 8 \
--max_steps 0 \
--num_train_epochs 5 \
--learning_rate 1e-5 \
--adam_beta2 0.999 \
--max_grad_norm 1.0 \
--weight_decay 0.0 \
--seed 0 \
--trust_remote_code
```
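
The command combines a per-device batch size of 2 with 8 gradient accumulation steps, so each optimizer step sees more than two sequences. A minimal sketch of that arithmetic follows; the number of processes is an assumption, since the README does not state how many GPUs `accelerate` was launched on, and `effective_batch_size` is a hypothetical helper.

```python
# Values copied from the command above.
PER_DEVICE_TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 8


def effective_batch_size(num_processes: int) -> int:
    """Sequences that contribute to one optimizer step across all processes."""
    return PER_DEVICE_TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * num_processes


# num_processes=1 is an assumption for illustration, not a documented setting.
print(effective_batch_size(num_processes=1))  # 16
```

With more than one process, the figure scales linearly, for example 64 sequences per step on four devices.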
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_habanoz__tinyllama-oasst1-top1-instruct-full-lr1-5-v0.1).

| Metric |Value|
|---------------------------------|----:|
|Avg. |35.58|
|AI2 Reasoning Challenge (25-Shot)|32.85|
|HellaSwag (10-Shot) |58.16|
|MMLU (5-Shot) |25.96|
|TruthfulQA (0-shot) |38.35|
|Winogrande (5-shot) |57.70|
|GSM8k (5-shot) | 0.45|
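
The `Avg.` row can be reproduced as the unweighted mean of the six benchmark scores in the table. A short check, using only the values listed above:

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 32.85,
    "HellaSwag (10-Shot)": 58.16,
    "MMLU (5-Shot)": 25.96,
    "TruthfulQA (0-shot)": 38.35,
    "Winogrande (5-shot)": 57.70,
    "GSM8k (5-shot)": 0.45,
}

# Unweighted mean over all six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 35.58
```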

5
added_tokens.json Normal file

@@ -0,0 +1,5 @@
{
"<|im_end|>": 32002,
"<|im_start|>": 32001,
"[PAD]": 32000
}
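
The three added tokens occupy consecutive ids starting at 32000, which lines up with the `vocab_size` of 32003 in `config.json`. A small sketch of that consistency check, assuming the ids are contiguous and that the base tokenizer fills ids 0 to 31999:

```python
import json

# Contents of added_tokens.json as shown above.
added_tokens = json.loads("""
{
  "<|im_end|>": 32002,
  "<|im_start|>": 32001,
  "[PAD]": 32000
}
""")

# With ids starting at 0, the vocabulary size is the highest id plus one.
vocab_size = max(added_tokens.values()) + 1
print(sorted(added_tokens.values()))  # [32000, 32001, 32002]
print(vocab_size)                     # 32003
```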

26
config.json Normal file

@@ -0,0 +1,26 @@
{
"_name_or_path": "TinyLlama/TinyLlama-1.1B-intermediate-step-715k-1.5T",
"architectures": [
"LlamaForCausalLM"
],
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 2048,
"initializer_range": 0.02,
"intermediate_size": 5632,
"max_position_embeddings": 2048,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 22,
"num_key_value_heads": 4,
"pad_token_id": 0,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"tie_word_embeddings": false,
"torch_dtype": "float32",
"transformers_version": "4.31.0",
"use_cache": false,
"vocab_size": 32003
}
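
The values in this config are enough to estimate the parameter count. The sketch below assumes the usual Llama layout (bias-free `q`, `k`, `v`, `o` projections, a three-matrix gated MLP, two RMSNorm weights per layer, and one final norm); that layout is an assumption here and is not read from the checkpoint itself.

```python
# Values copied from config.json above.
hidden_size = 2048
intermediate_size = 5632
num_hidden_layers = 22
num_attention_heads = 32
num_key_value_heads = 4
vocab_size = 32003

head_dim = hidden_size // num_attention_heads   # 64
kv_dim = head_dim * num_key_value_heads         # 256 (grouped-query attention)

# Assumed Llama layout: q and o are hidden x hidden, k and v are hidden x kv_dim.
attention = 2 * hidden_size * hidden_size + 2 * hidden_size * kv_dim
mlp = 3 * hidden_size * intermediate_size       # gate, up, and down projections
norms = 2 * hidden_size                         # two RMSNorm weights per layer
per_layer = attention + mlp + norms

# tie_word_embeddings is false, so input and output embeddings are counted separately.
embeddings = 2 * vocab_size * hidden_size
total = num_hidden_layers * per_layer + embeddings + hidden_size  # plus final norm

bytes_fp32 = 4 * total                          # torch_dtype is float32
print(total)       # 1100060672
print(bytes_fp32)  # 4400242688
```

The estimate of roughly 1.1 billion parameters sits 28352 bytes below the 4400271040 bytes recorded for `model.safetensors` in this commit. A gap of that size would be consistent with file header metadata, though that is not verified here.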

4
generation_config.json Normal file

@@ -0,0 +1,4 @@
{
"max_new_tokens": 32,
"transformers_version": "4.31.0"
}

3
model.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5d7e671102d45cfccf95c0e6358634a03f128490d8bb408ce40cad67ee1ef900
size 4400271040
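
The three lines above are a Git LFS pointer rather than the weights themselves: a spec version, a SHA-256 object id, and the size in bytes. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its version, hash, and size fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    hash_algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from model.safetensors above.
pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:5d7e671102d45cfccf95c0e6358634a03f128490d8bb408ce40cad67ee1ef900
size 4400271040
""")

print(pointer["hash_algo"])                  # sha256
print(f"{pointer['size'] / 2**30:.2f} GiB")  # 4.10 GiB
```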

3
optimizer.pt Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3dff1c1ccc2f843f37b98e9ef2d3201432678e0fcfba462f61719ed925dddc96
size 8800658482

3
pytorch_model.bin Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:98656cdc3cb658e5c9c614b7a6a8373c39a82a73addd5f1c28cadbce65a9ff5c
size 4400320906

3
rng_state.pth Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:510b5bea665e30badb396a842fd0ce2871abed1251840cc1bd70ee16b88c258a
size 14180

3
scheduler.pt Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ee523aaebf9a5639f449d144ff1e51abd768a2ec6e88d02e4c8c1649820d06fe
size 1064

28
special_tokens_map.json Normal file

@@ -0,0 +1,28 @@
{
"additional_special_tokens": [
"<|im_start|>",
"<|im_end|>"
],
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": "[PAD]",
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}
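
The `<|im_start|>` and `<|im_end|>` tokens suggest a ChatML-style prompt layout. The sketch below builds such a prompt and, optionally, generates from it. The exact template is an assumption inferred from the special tokens, not something documented in this repository, and `build_prompt` and `generate` are hypothetical helpers. The repository id is taken from the commit message.

```python
MODEL_ID = "habanoz/tinyllama-oasst1-top1-instruct-full-lr1-5-v0.1"


def build_prompt(user_message: str) -> str:
    """Wrap a user message in an assumed ChatML-style template."""
    return (
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def generate(user_message: str, max_new_tokens: int = 128) -> str:
    """Download the model and generate a reply (requires transformers and torch)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(user_message), return_tensors="pt")
    # generation_config.json defaults to 32 new tokens, so pass a limit explicitly.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=False)


print(build_prompt("What is the capital of France?"))
```

Calling `generate(...)` is left out of the example because it downloads about 4.4 GB of weights.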

93418
tokenizer.json Normal file

File diff suppressed because it is too large.

BIN
tokenizer.model (Stored with Git LFS) Normal file

Binary file not shown.

33
tokenizer_config.json Normal file

@@ -0,0 +1,33 @@
{
"bos_token": {
"__type": "AddedToken",
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"clean_up_tokenization_spaces": false,
"eos_token": {
"__type": "AddedToken",
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"legacy": false,
"model_max_length": 1000000000000000019884624838656,
"pad_token": null,
"padding_side": "right",
"sp_model_kwargs": {},
"tokenizer_class": "LlamaTokenizer",
"unk_token": {
"__type": "AddedToken",
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}
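
The `model_max_length` value above is what Python produces for `int(1e30)`, which reads as a "no limit set" placeholder rather than a real context length. A hedged sketch of falling back to the positional limit from `config.json`; `usable_max_length` is a hypothetical helper.

```python
# model_max_length from tokenizer_config.json above.
MODEL_MAX_LENGTH = 1000000000000000019884624838656
# max_position_embeddings from config.json in this commit.
MAX_POSITION_EMBEDDINGS = 2048


def usable_max_length(model_max_length: int, position_limit: int) -> int:
    """Use the smaller of the tokenizer limit and the model's positional limit."""
    return min(model_max_length, position_limit)


print(MODEL_MAX_LENGTH == int(1e30))                                 # True
print(usable_max_length(MODEL_MAX_LENGTH, MAX_POSITION_EMBEDDINGS))  # 2048
```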

2384
trainer_state.json Normal file

File diff suppressed because it is too large.

3
training_args.bin Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:09840659b95c826a5ecfd883555afc96f5502a9ef6dc46e27ec0b7cadf65f389
size 5880