Initialize project; model provided by the ModelHub XC community

Model: anakin87/Llama-3-8b-ita-ties-pro Source: Original Platform
2026-05-15 11:47:48 +08:00
commit b520f7dfc0
13 changed files with 412806 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,64 @@
+---
+base_model:
+- meta-llama/Meta-Llama-3-8B-Instruct
+- DeepMount00/Llama-3-8b-Ita
+- swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
+library_name: transformers
+tags:
+- mergekit
+- merge
+license: llama3
+language:
+- it
+---
+# Llama-3-8b-ita-ties-pro
+
+This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+I tried to merge two of the best Italian LLMs using Mergekit. The results are acceptable, but I could not improve on the best existing model.
+
+## Evaluation
+
+For a detailed comparison of model performance, check out the [Leaderboard for Italian Language Models](https://huggingface.co/spaces/FinancialSupport/open_ita_llm_leaderboard).
+
+Here's a breakdown of the performance metrics:
+
+| Metric                      | hellaswag_it acc_norm | arc_it acc_norm | m_mmlu_it 5-shot acc | Average |
+|:----------------------------|:----------------------|:----------------|:---------------------|:--------|
+| **Accuracy Normalized**     | 0.6967                | 0.5646          | 0.5717               | 0.6110  |
+
+## Merge Details
+### Merge Method
+
+This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method, with [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) as the base model.
+
+### Models Merged
+
+The following models were included in the merge:
+* [DeepMount00/Llama-3-8b-Ita](https://huggingface.co/DeepMount00/Llama-3-8b-Ita)
+* [swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA](https://huggingface.co/swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA)
+
+### Configuration
+
+The following YAML configuration was used to produce this model:
+
+```yaml
+
+models:
+  - model: meta-llama/Meta-Llama-3-8B-Instruct
+    # no parameters necessary for base model
+  - model: swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
+    parameters:
+      density: 0.7
+      weight: 0.6
+  - model: DeepMount00/Llama-3-8b-Ita
+    parameters:
+      density: 0.7
+      weight: 0.3
+merge_method: ties
+base_model: meta-llama/Meta-Llama-3-8B-Instruct
+parameters:
+  normalize: true
+dtype: bfloat16
+
+```
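
To make the roles of `density`, `weight`, and `normalize` in the configuration above more concrete, here is a minimal pure-Python sketch of a TIES-style merge on flat lists of floats. It is a simplified illustration under stated assumptions, not mergekit's implementation: the function names `trim` and `ties_merge` and the toy values are hypothetical, and real merges operate per tensor on full model weights.

```python
# Simplified TIES-style merge on flat parameter lists (illustrative only).
# `trim` and `ties_merge` are hypothetical names, not part of mergekit.

def trim(delta, density):
    """Keep the top `density` fraction of entries by magnitude; zero the rest."""
    k = max(1, round(len(delta) * density))
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(order[:k])
    return [d if i in keep else 0.0 for i, d in enumerate(delta)]


def ties_merge(base, finetuned, densities, weights, normalize=True):
    """Merge fine-tuned parameter lists into `base` with a TIES-style procedure."""
    # 1. Task vectors: difference between each fine-tuned model and the base,
    #    sparsified so that only the largest-magnitude changes survive.
    deltas = [
        trim([f - b for f, b in zip(model, base)], density)
        for model, density in zip(finetuned, densities)
    ]
    merged = []
    for i, b in enumerate(base):
        weighted = [w * d[i] for w, d in zip(weights, deltas)]
        # 2. Elect a sign per parameter from the weighted sum of the deltas.
        sign = 1.0 if sum(weighted) >= 0 else -1.0
        # 3. Combine only the deltas that agree with the elected sign.
        agree = [(w, wd) for w, wd in zip(weights, weighted) if wd * sign > 0]
        if not agree:
            merged.append(b)
            continue
        total = sum(wd for _, wd in agree)
        # With normalize=True, divide by the sum of the agreeing weights.
        divisor = sum(w for w, _ in agree) if normalize else 1.0
        merged.append(b + total / divisor)
    return merged


# Toy example reusing the density and weight values from the configuration.
base = [1.0, 1.0, 1.0, 1.0]
model_a = [2.0, 0.0, 1.5, 1.1]  # stands in for the weight-0.6 model
model_b = [2.0, 4.0, 1.0, 0.5]  # stands in for the weight-0.3 model
merged = ties_merge(base, [model_a, model_b], [0.7, 0.7], [0.6, 0.3])
print(merged)
```

In this sketch, a parameter where the two models disagree on the direction of change takes its value only from the side that wins the sign election, which is the interference-reduction idea behind TIES.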