license, library_name, base_model, base_model_relation, tags
license library_name base_model base_model_relation tags
llama3 transformers
NousResearch/Hermes-2-Pro-Llama-3-8B
aaditya/Llama3-OpenBioLLM-8B
meta-llama/Meta-Llama-3-8B-Instruct
merge
mindnlp
wizard
merge
dare_ties
llama
text-generation
arxiv:2311.03099
arxiv:2306.01708

Llama3-8B-merge-biomed-wizard (MindNLP Wizard Reproduction)

Wizard

This is a DARE-TIES merge reproduction of Llama3-8B-Instruct + NousResearch/Hermes-2-Pro-Llama-3-8B + aaditya/Llama3-OpenBioLLM-8B.

The overall merge recipe and benchmark setup follow lighteternal/Llama3-merge-biomed-8b, while the actual merge implementation is performed with MindNLP Wizard on MindSpore/Ascend.

Implementation Statement

  • Merge engine: MindNLP Wizard
  • Runtime stack: MindSpore + Ascend
  • Output dtype: bfloat16

Usage

Prompt template recommendation remains the Llama3 format: https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/

Leaderboard Metrics (Open LLM Leaderboard style)

Task Metric Ours (Wizard, %) Llama3-8B-Instruct (%) OpenBioLLM-8B (%)
ARC Challenge Accuracy 59.73 57.17 55.38
Normalized Accuracy 64.59 60.75 58.62
HellaSwag Accuracy 62.26 62.59 61.83
Normalized Accuracy 81.35 81.53 80.76
Winogrande Accuracy 76.01 74.51 70.88
GSM8K Accuracy 70.81 68.69 10.15
MMLU-Anatomy Accuracy 71.11 72.59 69.62
MMLU-Clinical Knowledge Accuracy 77.74 77.83 60.38
MMLU-College Biology Accuracy 80.56 81.94 79.86
MMLU-College Medicine Accuracy 68.21 63.58 70.52
MMLU-Medical Genetics Accuracy 82.00 80.00 80.00
MMLU-Professional Medicine Accuracy 77.57 71.69 77.94

Merge Details

Merge Method

This model is merged using the DARE-TIES method with meta-llama/Meta-Llama-3-8B-Instruct as base.

Models Merged

The following donor models are included in the merge:

Configuration

The following YAML configuration is used:

models:
  - model: meta-llama/Meta-Llama-3-8B-Instruct
    # Base model providing a general foundation without specific parameters

  - model: meta-llama/Meta-Llama-3-8B-Instruct
    parameters:
      density: 0.60
      weight: 0.5

  - model: NousResearch/Hermes-2-Pro-Llama-3-8B
    parameters:
      density: 0.55
      weight: 0.1

  - model: aaditya/Llama3-OpenBioLLM-8B
    parameters:
      density: 0.55
      weight: 0.4

merge_method: dare_ties
base_model: meta-llama/Meta-Llama-3-8B-Instruct
parameters:
  int8_mask: true
dtype: bfloat16

Reproducibility Notes

  • Few-shot settings:
    • ARC Challenge: 25-shot
    • HellaSwag: 10-shot
    • Winogrande: 5-shot
    • GSM8K: 5-shot
    • MMLU-* subsets: 5-shot

Environment (Inference / Evaluation)

  • Accelerator: Ascend 910B2
  • MindSpore: 2.7.1

References

Description
Model synced from source: chenjingshen/Llama3-8B-merge-biomed-wizard
Readme 2.6 MiB