Tito-7B-slerp/README.md
ModelHub XC e93fd63f90 Initialize project; model provided by the ModelHub XC community
Model: Stopwolf/Tito-7B-slerp
Source: Original Platform
2026-05-12 23:40:26 +08:00

---
license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- gordicaleksa/YugoGPT
- mlabonne/AlphaMonarch-7B
model-index:
- name: Tito-7B-slerp
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 68.09
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Stopwolf/Tito-7B-slerp
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 86.38
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Stopwolf/Tito-7B-slerp
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 64.01
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Stopwolf/Tito-7B-slerp
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 57.01
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Stopwolf/Tito-7B-slerp
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 81.69
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Stopwolf/Tito-7B-slerp
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 63.61
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Stopwolf/Tito-7B-slerp
      name: Open LLM Leaderboard
---

Tito-7B-slerp

Tito-7B-slerp is a merge of the following models using mergekit:

- gordicaleksa/YugoGPT
- mlabonne/AlphaMonarch-7B

🧩 Configuration

slices:
  - sources:
      - model: gordicaleksa/YugoGPT
        layer_range: [0, 32]
      - model: mlabonne/AlphaMonarch-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: mlabonne/AlphaMonarch-7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.6
dtype: bfloat16
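
For readers unfamiliar with the merge method, here is a minimal pure-Python sketch of spherical linear interpolation (SLERP) and of how a list of `t` anchor values such as `[0, 0.5, 0.3, 0.7, 1]` could be spread across 32 layers. It assumes that the anchors are linearly interpolated over the layer index and that `t = 0` returns the first vector unchanged. mergekit works on full weight tensors and its implementation may differ in detail. The names `slerp` and `layer_t` are illustrative and are not part of mergekit.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length vectors (lists of floats)."""
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the two directions, clamped for acos.
    dot = sum(a * b for a, b in zip(v0, v1)) / max(n0 * n1, eps)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


def layer_t(anchors, layer, num_layers):
    """Assumed scheme: linearly interpolate anchor values over the layer index."""
    if num_layers == 1 or len(anchors) == 1:
        return float(anchors[0])
    pos = layer / (num_layers - 1) * (len(anchors) - 1)
    lo = min(int(pos), len(anchors) - 2)
    frac = pos - lo
    return (1 - frac) * anchors[lo] + frac * anchors[lo + 1]


# Per-layer t for the self_attn filter from the configuration above.
self_attn_t = [layer_t([0, 0.5, 0.3, 0.7, 1], i, 32) for i in range(32)]
midpoint = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

Under these assumptions, the `self_attn` schedule starts at 0 in the first layer and ends at 1 in the last, while the `mlp` schedule runs the other way, so the two filters draw on the two source models in opposite proportions along the depth of the network.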

Results

Evaluation on the Serbian LLM eval suite, which measures performance in and knowledge of Serbian:

| Model | ARC-E | ARC-C | Hellaswag | BoolQ | Winogrande | OpenbookQA | PiQA | NQ Open | TriviaQA | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|
| Zamfir-7B | 51.85 | 32.25 | 46.03 | 75.59 | 62.59 | 26.00 | 66.81 | 16.09 | 36.11 | 45.92 |
| Mustra-7B | 52.95 | 33.70 | 45.89 | 77.55 | 64.17 | 30.60 | 67.25 | 15.40 | 34.84 | 46.93 |
| Tito-7B | 55.43 | 34.73 | 48.19 | 77.37 | 65.27 | 30.00 | 67.30 | 16.7 | 35.38 | 47.82 |
| YugoGPT | 57.79 | 34.73 | 49.89 | 69.45 | 64.56 | 28.20 | 72.03 | 15.82 | 36.14 | 47.62 |

All benchmarks here were run 0-shot, with the exception of NQ Open and TriviaQA, which were run 5-shot in order to be comparable with the Mistral paper.

If we try to replicate the Open LLM Leaderboard results on the available Serbian datasets (running the appropriate number of shots for each benchmark instead of 0), we get:

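| Model | ARC | Hellaswag | Winogrande | TruthfulQA | Avg. |
|---|---|---|---|---|---|
| Tito-7B | 47.27 | - | 69.93 | 57.48 | 58.23 |
| Perucac-7B | 49.74 | - | 71.98 | 56.03 | 59.25 |
| YugoGPT | 44.03 | - | 70.64 | 48.06 | 54.24 |
| Llama3-8B | 42.24 | - | 61.25 | 51.08 | 51.52 |
| SambaLingo | 37.88 | - | 61.48 | 47.23 | 48.86 |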

Note that YugoGPT, Llama3 and SambaLingo are all base models, unlike Tito and Perucac.
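
The Avg. column in the table above appears to be the plain mean of the three available scores, with the missing Hellaswag entry skipped. Here is a minimal Python sketch of that arithmetic, assuming this reading is correct. The helper name `mean_available` is illustrative, and the values are copied from the table.

```python
def mean_available(scores):
    """Average the numeric entries, skipping missing ones (None), to 2 decimals."""
    present = [s for s in scores if s is not None]
    return round(sum(present) / len(present), 2)


# Column order: ARC, Hellaswag, Winogrande, TruthfulQA (None marks the "-" cells).
rows = {
    "Tito-7B": [47.27, None, 69.93, 57.48],
    "Perucac-7B": [49.74, None, 71.98, 56.03],
    "YugoGPT": [44.03, None, 70.64, 48.06],
    "Llama3-8B": [42.24, None, 61.25, 51.08],
    "SambaLingo": [37.88, None, 61.48, 47.23],
}

averages = {name: mean_available(vals) for name, vals in rows.items()}
for name, avg in averages.items():
    print(f"{name}: {avg:.2f}")
```

Because Hellaswag is excluded, these averages are not directly comparable with averages that cover a different set of benchmarks.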

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Tito | YugoGPT |
|---|---|---|
| Avg. | 70.13 | 57.34 |
| AI2 Reasoning Challenge (25-Shot) | 68.09 | 58.10 |
| HellaSwag (10-Shot) | 86.38 | 81.44 |
| MMLU (5-Shot) | 64.01 | 60.68 |
| TruthfulQA (0-shot) | 57.01 | 36.60 |
| Winogrande (5-shot) | 81.69 | 76.56 |
| GSM8k (5-shot) | 63.61 | 30.70 |