Llama3-8B-merge-biomed-wizard/README.md

---
license: llama3
library_name: transformers
base_model:
  - NousResearch/Hermes-2-Pro-Llama-3-8B
  - aaditya/Llama3-OpenBioLLM-8B
  - meta-llama/Meta-Llama-3-8B-Instruct
base_model_relation: merge
tags:
  - mindnlp
  - wizard
  - merge
  - dare_ties
  - llama
  - text-generation
  - arxiv:2311.03099
  - arxiv:2306.01708
---

# Llama3-8B-merge-biomed-wizard (MindNLP Wizard Reproduction)

![Wizard](./Wizard.png)

This is a DARE-TIES merge reproduction of Llama3-8B-Instruct + NousResearch/Hermes-2-Pro-Llama-3-8B + aaditya/Llama3-OpenBioLLM-8B.

The overall merge recipe and benchmark setup follow [lighteternal/Llama3-merge-biomed-8b](https://huggingface.co/lighteternal/Llama3-merge-biomed-8b), while the actual merge implementation is performed with **MindNLP Wizard** on MindSpore/Ascend.

## Implementation Statement

- Merge engine: **MindNLP Wizard**
- Runtime stack: MindSpore + Ascend
- Output dtype: `bfloat16`

## Usage

Prompt template recommendation remains the Llama3 format:
<https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/>

## Leaderboard Metrics (Open LLM Leaderboard style)

| Task | Metric | Ours (Wizard, %) | Llama3-8B-Instruct (%) | OpenBioLLM-8B (%) |
| --- | --- | ---: | ---: | ---: |
| **ARC Challenge** | Accuracy | **59.73** | 57.17 | 55.38 |
|  | Normalized Accuracy | **64.59** | 60.75 | 58.62 |
| **HellaSwag** | Accuracy | 62.26 | **62.59** | 61.83 |
|  | Normalized Accuracy | 81.35 | **81.53** | 80.76 |
| **Winogrande** | Accuracy | **76.01** | 74.51 | 70.88 |
| **GSM8K** | Accuracy | **70.81** | 68.69 | 10.15 |
| **MMLU-Anatomy** | Accuracy | 71.11 | **72.59** | 69.62 |
| **MMLU-Clinical Knowledge** | Accuracy | 77.74 | **77.83** | 60.38 |
| **MMLU-College Biology** | Accuracy | 80.56 | **81.94** | 79.86 |
| **MMLU-College Medicine** | Accuracy | 68.21 | 63.58 | **70.52** |
| **MMLU-Medical Genetics** | Accuracy | 82.00 | 80.00 | 80.00 |
| **MMLU-Professional Medicine** | Accuracy | 77.57 | 71.69 | **77.94** |

## Merge Details

### Merge Method

This model is merged using the **DARE-TIES** method with `meta-llama/Meta-Llama-3-8B-Instruct` as base.

### Models Merged

The following donor models are included in the merge:

- [NousResearch/Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B)
- [aaditya/Llama3-OpenBioLLM-8B](https://huggingface.co/aaditya/Llama3-OpenBioLLM-8B)

### Configuration

The following YAML configuration is used:

```yaml
models:
  - model: meta-llama/Meta-Llama-3-8B-Instruct
    # Base model providing a general foundation without specific parameters

  - model: meta-llama/Meta-Llama-3-8B-Instruct
    parameters:
      density: 0.60
      weight: 0.5

  - model: NousResearch/Hermes-2-Pro-Llama-3-8B
    parameters:
      density: 0.55
      weight: 0.1

  - model: aaditya/Llama3-OpenBioLLM-8B
    parameters:
      density: 0.55
      weight: 0.4

merge_method: dare_ties
base_model: meta-llama/Meta-Llama-3-8B-Instruct
parameters:
  int8_mask: true
dtype: bfloat16
```

## Reproducibility Notes

- Few-shot settings:
  - ARC Challenge: 25-shot
  - HellaSwag: 10-shot
  - Winogrande: 5-shot
  - GSM8K: 5-shot
  - MMLU-* subsets: 5-shot

## Environment (Inference / Evaluation)

- Accelerator: Ascend 910B2
- MindSpore: 2.7.1

## References

- [Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch](https://arxiv.org/abs/2311.03099)
- [Resolving Interference When Merging Models](https://arxiv.org/abs/2306.01708)