--- license: llama3 library_name: transformers base_model: - NousResearch/Hermes-2-Pro-Llama-3-8B - aaditya/Llama3-OpenBioLLM-8B - meta-llama/Meta-Llama-3-8B-Instruct base_model_relation: merge tags: - mindnlp - wizard - merge - dare_ties - llama - text-generation - arxiv:2311.03099 - arxiv:2306.01708 --- # Llama3-8B-merge-biomed-wizard (MindNLP Wizard Reproduction) ![Wizard](./Wizard.png) This is a DARE-TIES merge reproduction of Llama3-8B-Instruct + NousResearch/Hermes-2-Pro-Llama-3-8B + aaditya/Llama3-OpenBioLLM-8B. The overall merge recipe and benchmark setup follow [lighteternal/Llama3-merge-biomed-8b](https://huggingface.co/lighteternal/Llama3-merge-biomed-8b), while the actual merge implementation is performed with **MindNLP Wizard** on MindSpore/Ascend. ## Implementation Statement - Merge engine: **MindNLP Wizard** - Runtime stack: MindSpore + Ascend - Output dtype: `bfloat16` ## Usage Prompt template recommendation remains the Llama3 format: ## Leaderboard Metrics (Open LLM Leaderboard style) | Task | Metric | Ours (Wizard, %) | Llama3-8B-Instruct (%) | OpenBioLLM-8B (%) | | --- | --- | ---: | ---: | ---: | | **ARC Challenge** | Accuracy | **59.73** | 57.17 | 55.38 | | | Normalized Accuracy | **64.59** | 60.75 | 58.62 | | **HellaSwag** | Accuracy | 62.26 | **62.59** | 61.83 | | | Normalized Accuracy | 81.35 | **81.53** | 80.76 | | **Winogrande** | Accuracy | **76.01** | 74.51 | 70.88 | | **GSM8K** | Accuracy | **70.81** | 68.69 | 10.15 | | **MMLU-Anatomy** | Accuracy | 71.11 | **72.59** | 69.62 | | **MMLU-Clinical Knowledge** | Accuracy | 77.74 | **77.83** | 60.38 | | **MMLU-College Biology** | Accuracy | 80.56 | **81.94** | 79.86 | | **MMLU-College Medicine** | Accuracy | 68.21 | 63.58 | **70.52** | | **MMLU-Medical Genetics** | Accuracy | 82.00 | 80.00 | 80.00 | | **MMLU-Professional Medicine** | Accuracy | 77.57 | 71.69 | **77.94** | ## Merge Details ### Merge Method This model is merged using the **DARE-TIES** method with `meta-llama/Meta-Llama-3-8B-Instruct` as base. ### Models Merged The following donor models are included in the merge: - [NousResearch/Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B) - [aaditya/Llama3-OpenBioLLM-8B](https://huggingface.co/aaditya/Llama3-OpenBioLLM-8B) ### Configuration The following YAML configuration is used: ```yaml models: - model: meta-llama/Meta-Llama-3-8B-Instruct # Base model providing a general foundation without specific parameters - model: meta-llama/Meta-Llama-3-8B-Instruct parameters: density: 0.60 weight: 0.5 - model: NousResearch/Hermes-2-Pro-Llama-3-8B parameters: density: 0.55 weight: 0.1 - model: aaditya/Llama3-OpenBioLLM-8B parameters: density: 0.55 weight: 0.4 merge_method: dare_ties base_model: meta-llama/Meta-Llama-3-8B-Instruct parameters: int8_mask: true dtype: bfloat16 ``` ## Reproducibility Notes - Few-shot settings: - ARC Challenge: 25-shot - HellaSwag: 10-shot - Winogrande: 5-shot - GSM8K: 5-shot - MMLU-* subsets: 5-shot ## Environment (Inference / Evaluation) - Accelerator: Ascend 910B2 - MindSpore: 2.7.1 ## References - [Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch](https://arxiv.org/abs/2311.03099) - [Resolving Interference When Merging Models](https://arxiv.org/abs/2306.01708)