Initialize project; model provided by the ModelHub XC community

Model: anakin87/Llama-3-8b-ita-slerp Source: Original Platform
2026-05-16 19:53:59 +08:00
commit f56e0a0676
13 changed files with 412774 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,77 @@
+---
+base_model:
+- swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
+- DeepMount00/Llama-3-8b-Ita
+library_name: transformers
+tags:
+- mergekit
+- merge
+license: llama3
+language:
+- it
+---
+# Llama-3-8b-ita-slerp
+
+This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+I tried to merge two of the best Italian LLMs using mergekit. The results are acceptable, but the merge does not improve on the best existing model.
+
+## Evaluation
+
+For a detailed comparison of model performance, check out the [Leaderboard for Italian Language Models](https://huggingface.co/spaces/FinancialSupport/open_ita_llm_leaderboard).
+
+Here's a breakdown of the performance metrics:
+
+| Metric    | hellaswag_it acc_norm | arc_it acc_norm | m_mmlu_it 5-shot acc | Average |
+|:----------|:----------------------|:----------------|:---------------------|:--------|
+| **Score** | 0.6879                | 0.5714          | 0.5732               | 0.6109  |
+
+## Merge Details
+### Merge Method
+
+This model was merged using the SLERP (spherical linear interpolation) merge method.
+
+### Models Merged
+
+The following models were included in the merge:
+* [swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA](https://huggingface.co/swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA)
+* [DeepMount00/Llama-3-8b-Ita](https://huggingface.co/DeepMount00/Llama-3-8b-Ita)
+
+### Configuration
+
+The following YAML configuration was used to produce this model:
+
+```yaml
+
+slices:
+- sources:
+  - model: swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
+    layer_range:
+    - 0
+    - 32
+  - model: DeepMount00/Llama-3-8b-Ita
+    layer_range:
+    - 0
+    - 32
+merge_method: slerp
+base_model: swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
+parameters:
+  t:
+  - filter: self_attn
+    value:
+    - 0
+    - 0.5
+    - 0.3
+    - 0.7
+    - 1
+  - filter: mlp
+    value:
+    - 1
+    - 0.5
+    - 0.7
+    - 0.3
+    - 0
+  - value: 0.5
+dtype: bfloat16
+
+```
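
To make the configuration in the README above more concrete, here is a minimal sketch of the two ideas it combines: spherical linear interpolation between two weight vectors, and a per-layer interpolation factor `t` derived from the five anchor values listed under each filter. This is not mergekit's code. The helper names `slerp` and `layer_t` are hypothetical, and two points are assumptions based on my reading of mergekit's behavior: that `t = 0` yields the `base_model` weights and `t = 1` the other model's, and that a list of values is spread piecewise-linearly over the 32 layers.

```python
import math

# Anchor values copied from the YAML configuration above.
SELF_ATTN_T = [0, 0.5, 0.3, 0.7, 1]
MLP_T = [1, 0.5, 0.7, 0.3, 0]
DEFAULT_T = 0.5  # applies to tensors matched by neither filter


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two equal-length vectors.

    t=0 returns v0 and t=1 returns v1. Falls back to plain linear
    interpolation when the vectors are (almost) colinear, where the
    spherical formula would divide by a value close to zero.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the two directions, clamped for acos.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / max(norm0 * norm1, eps)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    if abs(cos_theta) > 1.0 - 1e-6:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    theta = math.acos(cos_theta)
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]


def layer_t(anchors, layer_idx, num_layers=32):
    """Piecewise-linear value of t for one layer (assumed schedule).

    The anchors are treated as evenly spaced from the first layer to
    the last, which is an assumption rather than mergekit's exact rule.
    """
    if len(anchors) == 1 or num_layers == 1:
        return float(anchors[0])
    pos = layer_idx / (num_layers - 1) * (len(anchors) - 1)
    lo = min(int(pos), len(anchors) - 2)
    frac = pos - lo
    return (1 - frac) * anchors[lo] + frac * anchors[lo + 1]


if __name__ == "__main__":
    # Toy 2-d "weights" standing in for real tensors.
    base, other = [1.0, 0.0], [0.0, 1.0]
    for layer in (0, 8, 16, 24, 31):
        t_attn = layer_t(SELF_ATTN_T, layer)
        t_mlp = layer_t(MLP_T, layer)
        print(layer, round(t_attn, 3), round(t_mlp, 3), slerp(t_attn, base, other))
```

Under these assumptions the two schedules mirror each other: at every anchor the `self_attn` and `mlp` values sum to 1, so in the first layers the attention weights stay close to the base model while the MLP weights lean toward the other model, and the roles are reversed in the last layers.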