初始化项目,由ModelHub XC社区提供模型

Model: Stopwolf/Tito-7B-slerp
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-05-12 23:40:26 +08:00
commit e93fd63f90
17 changed files with 91488 additions and 0 deletions

35
.gitattributes vendored Normal file
View File

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text

176
README.md Normal file
View File

@@ -0,0 +1,176 @@
---
license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- gordicaleksa/YugoGPT
- mlabonne/AlphaMonarch-7B
model-index:
- name: Tito-7B-slerp
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: AI2 Reasoning Challenge (25-Shot)
type: ai2_arc
config: ARC-Challenge
split: test
args:
num_few_shot: 25
metrics:
- type: acc_norm
value: 68.09
name: normalized accuracy
source:
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Stopwolf/Tito-7B-slerp
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: HellaSwag (10-Shot)
type: hellaswag
split: validation
args:
num_few_shot: 10
metrics:
- type: acc_norm
value: 86.38
name: normalized accuracy
source:
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Stopwolf/Tito-7B-slerp
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU (5-Shot)
type: cais/mmlu
config: all
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 64.01
name: accuracy
source:
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Stopwolf/Tito-7B-slerp
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: TruthfulQA (0-shot)
type: truthful_qa
config: multiple_choice
split: validation
args:
num_few_shot: 0
metrics:
- type: mc2
value: 57.01
source:
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Stopwolf/Tito-7B-slerp
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: Winogrande (5-shot)
type: winogrande
config: winogrande_xl
split: validation
args:
num_few_shot: 5
metrics:
- type: acc
value: 81.69
name: accuracy
source:
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Stopwolf/Tito-7B-slerp
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: GSM8k (5-shot)
type: gsm8k
config: main
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 63.61
name: accuracy
source:
url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Stopwolf/Tito-7B-slerp
name: Open LLM Leaderboard
---
# Tito-7B-slerp
Tito-7B-slerp is a merge of the following models using [mergekit](https://github.com/cg123/mergekit):
* [gordicaleksa/YugoGPT](https://huggingface.co/gordicaleksa/YugoGPT)
* [mlabonne/AlphaMonarch-7B](https://huggingface.co/mlabonne/AlphaMonarch-7B)
## 🧩 Configuration
```yaml
slices:
- sources:
- model: gordicaleksa/YugoGPT
layer_range: [0, 32]
- model: mlabonne/AlphaMonarch-7B
layer_range: [0, 32]
merge_method: slerp
base_model: mlabonne/AlphaMonarch-7B
parameters:
t:
- filter: self_attn
value: [0, 0.5, 0.3, 0.7, 1]
- filter: mlp
value: [1, 0.5, 0.7, 0.3, 0]
- value: 0.6
dtype: bfloat16
```
## Results
Evaluations on Serbian LLM eval suite (or rather, performance and knowledge of Serbian):
| | ARC-E | ARC-C | Hellaswag | BoolQ | Winogrande | OpenbookQA | PiQA | NQ Open | TriviaQA | Avg. |
|-----------|-------|-------|-----------|-------|------------|------------|-------|---------|----------|-------|
| [Zamfir-7B](https://huggingface.co/Stopwolf/Zamfir-7B-slerp) | 51.85 | 32.25 | 46.03 | 75.59 | 62.59 | 26.00 | 66.81 | 16.09 | 36.11 | 45.92 |
| [Mustra-7B](https://huggingface.co/Stopwolf/Mustra-7B-Instruct-v0.1) | 52.95 | 33.70 | 45.89 | **77.55** | 64.17 | **30.60** | 67.25 | 15.40 | 34.84 | 46.93 |
| [Tito-7B](https://huggingface.co/Stopwolf/Tito-7B-slerp) | 55.43 | **34.73** | 48.19 | 77.37 | **65.27** | 30.00 | 67.30 | **16.7** | 35.38 | **47.82** |
| [YugoGPT](https://huggingface.co/gordicaleksa/YugoGPT) | **57.79** | **34.73** | **49.89** | 69.45 | 64.56 | 28.20 | **72.03** | 15.82 | **36.14** | 47.62 |
Here, all benchmarks were done 0-shot, on the exception of NQ Open and TriviaQA which were done in 5-shot manner, in order to be comparable to Mistral paper.
If we try to replicate OpenLLM Leaderboard results on available Serbian datasets (running an appropriate amount of shots instead of 0), we get:
| | ARC | Hellaswag | Winogrande | TruthfulQA | Avg. |
|---------|-------|-----------|------------|------------|-------|
| Tito-7B | 47.27 | - | 69.93 | **57.48** | 58.23 |
| [Perucac-7B](https://huggingface.co/Stopwolf/Perucac-7B-slerp) | **49.74** | - | **71.98** | 56.03 | **59.25** |
| YugoGPT | 44.03 | - | 70.64 | 48.06 | 54.24 |
| Llama3-8B | 42.24 | - | 61.25 | 51.08 | 51.52 |
| SambaLingo | 37.88 | - | 61.48 | 47.23 | 48.86 |
Note that YugoGPT, Llama3 and SambaLingo are all base models, unlike Tito and Perucac.
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Stopwolf__Tito-7B-slerp)
| Metric |Tito | YugoGPT |
|---------------------------------|----:|--------:|
|Avg. |70.13| 57.34 |
|AI2 Reasoning Challenge (25-Shot)|68.09| 58.10 |
|HellaSwag (10-Shot) |86.38| 81.44 |
|MMLU (5-Shot) |64.01| 60.68 |
|TruthfulQA (0-shot) |57.01| 36.60 |
|Winogrande (5-shot) |81.69| 76.56 |
|GSM8k (5-shot) |63.61| 30.70 |

26
config.json Normal file
View File

@@ -0,0 +1,26 @@
{
"_name_or_path": "mlabonne/AlphaMonarch-7B",
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 10000.0,
"sliding_window": 4096,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.38.1",
"use_cache": true,
"vocab_size": 32000
}

17
mergekit_config.yml Normal file
View File

@@ -0,0 +1,17 @@
slices:
- sources:
- model: gordicaleksa/YugoGPT
layer_range: [0, 32]
- model: mlabonne/AlphaMonarch-7B
layer_range: [0, 32]
merge_method: slerp
base_model: mlabonne/AlphaMonarch-7B
parameters:
t:
- filter: self_attn
value: [0, 0.5, 0.3, 0.7, 1]
- filter: mlp
value: [1, 0.5, 0.7, 0.3, 0]
- value: 0.6
dtype: bfloat16

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:852e01cd44f9eaf08dd26dfbfc0732aef3a8219a3c32efaf0d5fe4079b4986c0
size 1979773128

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f2a4994900de91acfe4c0553cd77610bd4b0a6f6100c76f594ae169f9d547c75
size 1946235640

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c9147ed1abbd5475be47462e822e13e0a34dc294089ec6de6e731a085a39829c
size 1973490216

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:aa62bb208ac52db37eab78570cb04c638e6cbde0d6de528f50e42d26c52b79c5
size 1979781464

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e628c575ad5405d6e48916645c08749ce54f74ca80a179ae0a6a18c15c200fe0
size 1946243984

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:26b1af55d0309b2a1c79ffc9d1bf532d343447fee5d4b2c23b80fc1025245e25
size 1923166040

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9c8b4eed2b322feec5c6c4fe74d67becbd67cf6c399c26fc62e657cd719abdee
size 1946243984

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8b495e827b22be885b475d633ba5bc6c04a6a4045edb1e7a168bb2dc16c377d7
size 788563544

File diff suppressed because one or more lines are too long

35
special_tokens_map.json Normal file
View File

@@ -0,0 +1,35 @@
{
"additional_special_tokens": [
"<unk>",
"<s>",
"</s>"
],
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

91122
tokenizer.json Normal file

File diff suppressed because it is too large Load Diff

BIN
tokenizer.model (Stored with Git LFS) Normal file

Binary file not shown.

49
tokenizer_config.json Normal file
View File

@@ -0,0 +1,49 @@
{
"add_bos_token": true,
"add_eos_token": false,
"added_tokens_decoder": {
"0": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"1": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
},
"additional_special_tokens": [
"<unk>",
"<s>",
"</s>"
],
"bos_token": "<s>",
"chat_template": "{% for message in messages %}{{bos_token + message['role'] + '\n' + message['content'] + eos_token + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ bos_token + 'assistant\n' }}{% endif %}",
"clean_up_tokenization_spaces": false,
"eos_token": "</s>",
"legacy": true,
"model_max_length": 8192,
"pad_token": "</s>",
"padding_side": "left",
"sp_model_kwargs": {},
"spaces_between_special_tokens": false,
"split_special_tokens": false,
"tokenizer_class": "LlamaTokenizer",
"unk_token": "<unk>",
"use_default_system_prompt": true
}