Model: Stopwolf/Tito-7B-slerp Source: Original Platform
| license | tags | model-index |
|---|---|---|
| apache-2.0 | | |
Tito-7B-slerp
Tito-7B-slerp is a merge of the following models using mergekit:
- gordicaleksa/YugoGPT
- mlabonne/AlphaMonarch-7B
🧩 Configuration
slices:
  - sources:
      - model: gordicaleksa/YugoGPT
        layer_range: [0, 32]
      - model: mlabonne/AlphaMonarch-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: mlabonne/AlphaMonarch-7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.6
dtype: bfloat16
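To make the `slerp` merge method and the `t` lists in the configuration more concrete, here is a minimal NumPy sketch of spherical linear interpolation between two weight tensors, plus a helper that spreads a list of `t` anchors over layer depth. This is an illustration, not mergekit's actual code: the function names `slerp` and `layer_t` are hypothetical, and the assumption that list values act as evenly spaced anchor points linearly interpolated across the 32 layers reflects a common reading of mergekit gradients rather than anything stated in this card.

```python
import numpy as np


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two same-shaped tensors.

    t = 0 returns v0, t = 1 returns v1 (hypothetical helper, not mergekit's API).
    """
    a = np.asarray(v0, dtype=np.float64).ravel()
    b = np.asarray(v1, dtype=np.float64).ravel()
    # Measure the angle between the two tensors using their unit vectors.
    ua = a / (np.linalg.norm(a) + eps)
    ub = b / (np.linalg.norm(b) + eps)
    dot = float(np.clip(np.dot(ua, ub), -1.0, 1.0))
    # Nearly parallel tensors: fall back to plain linear interpolation.
    if abs(dot) > 0.9995:
        return ((1.0 - t) * a + t * b).reshape(np.shape(v0))
    theta = np.arccos(dot)
    sin_theta = np.sin(theta)
    w0 = np.sin((1.0 - t) * theta) / sin_theta
    w1 = np.sin(t * theta) / sin_theta
    return (w0 * a + w1 * b).reshape(np.shape(v0))


def layer_t(anchors, layer_idx, num_layers):
    """Interpolation factor for one layer (assumed anchor-based schedule)."""
    if np.isscalar(anchors):
        return float(anchors)  # e.g. the default `value: 0.6`
    depth = layer_idx / (num_layers - 1)  # 0.0 at first layer, 1.0 at last
    positions = np.linspace(0.0, 1.0, len(anchors))
    return float(np.interp(depth, positions, anchors))


# Schedules copied from the configuration above.
SELF_ATTN_T = [0, 0.5, 0.3, 0.7, 1]
MLP_T = [1, 0.5, 0.7, 0.3, 0]
DEFAULT_T = 0.6

if __name__ == "__main__":
    for idx in (0, 8, 16, 24, 31):
        print(idx, layer_t(SELF_ATTN_T, idx, 32), layer_t(MLP_T, idx, 32))
```

Under these assumptions, the `self_attn` and `mlp` schedules mirror each other: where attention weights lean toward one parent model, the MLP weights of the same layer lean toward the other, and all remaining tensors use the constant 0.6.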
Results
Evaluations on the Serbian LLM eval suite (that is, performance in and knowledge of Serbian):
| Model | ARC-E | ARC-C | Hellaswag | BoolQ | Winogrande | OpenbookQA | PiQA | NQ Open | TriviaQA | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|
| Zamfir-7B | 51.85 | 32.25 | 46.03 | 75.59 | 62.59 | 26.00 | 66.81 | 16.09 | 36.11 | 45.92 |
| Mustra-7B | 52.95 | 33.70 | 45.89 | 77.55 | 64.17 | 30.60 | 67.25 | 15.40 | 34.84 | 46.93 |
| Tito-7B | 55.43 | 34.73 | 48.19 | 77.37 | 65.27 | 30.00 | 67.30 | 16.7 | 35.38 | 47.82 |
| YugoGPT | 57.79 | 34.73 | 49.89 | 69.45 | 64.56 | 28.20 | 72.03 | 15.82 | 36.14 | 47.62 |
Here, all benchmarks were run 0-shot, with the exception of NQ Open and TriviaQA, which were run 5-shot in order to be comparable to the Mistral paper.
If we try to replicate the Open LLM Leaderboard results on the available Serbian datasets (running the appropriate number of shots instead of 0), we get:
| Model | ARC | Hellaswag | Winogrande | TruthfulQA | Avg. |
|---|---|---|---|---|---|
| Tito-7B | 47.27 | - | 69.93 | 57.48 | 58.23 |
| Perucac-7B | 49.74 | - | 71.98 | 56.03 | 59.25 |
| YugoGPT | 44.03 | - | 70.64 | 48.06 | 54.24 |
| Llama3-8B | 42.24 | - | 61.25 | 51.08 | 51.52 |
| SambaLingo | 37.88 | - | 61.48 | 47.23 | 48.86 |
Note that YugoGPT, Llama3 and SambaLingo are all base models, unlike Tito and Perucac.
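The Avg. column in the table above appears to be the plain mean over the benchmarks that have a score, with the missing Hellaswag entry skipped. That is an inference from the numbers, not something the card states. A small sketch that reproduces the column under that assumption (the helper name `mean_available` is hypothetical):

```python
# Scores copied from the table above; None marks the missing Hellaswag entry.
# Column order: ARC, Hellaswag, Winogrande, TruthfulQA.
scores = {
    "Tito-7B": [47.27, None, 69.93, 57.48],
    "Perucac-7B": [49.74, None, 71.98, 56.03],
    "YugoGPT": [44.03, None, 70.64, 48.06],
    "Llama3-8B": [42.24, None, 61.25, 51.08],
    "SambaLingo": [37.88, None, 61.48, 47.23],
}


def mean_available(values):
    """Average only the benchmarks that have a reported score."""
    present = [v for v in values if v is not None]
    return round(sum(present) / len(present), 2)


averages = {model: mean_available(vals) for model, vals in scores.items()}

if __name__ == "__main__":
    for model, avg in averages.items():
        print(f"{model}: {avg}")
```

Because Hellaswag is excluded for every model, these averages are comparable with each other but not with the six-benchmark Open LLM Leaderboard average.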
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric | Tito | YugoGPT |
|---|---|---|
| Avg. | 70.13 | 57.34 |
| AI2 Reasoning Challenge (25-Shot) | 68.09 | 58.10 |
| HellaSwag (10-Shot) | 86.38 | 81.44 |
| MMLU (5-Shot) | 64.01 | 60.68 |
| TruthfulQA (0-Shot) | 57.01 | 36.60 |
| Winogrande (5-Shot) | 81.69 | 76.56 |
| GSM8k (5-Shot) | 63.61 | 30.70 |
Description