ModelHub XC 2322e8777c Initialize project; model provided by the ModelHub XC community
Model: grimjim/llama-3-Nephilim-v3-8B
Source: Original Platform
2026-06-07 21:21:21 +08:00


license: cc-by-nc-4.0
library_name: transformers
tags:
- mergekit
- merge
base_model:
- grimjim/Llama-3-Instruct-8B-SPPO-Iter3-SimPO-merge
- tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1
pipeline_tag: text-generation
model-index:
- name: llama-3-Nephilim-v3-8B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 41.74
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=grimjim/llama-3-Nephilim-v3-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 28.96
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=grimjim/llama-3-Nephilim-v3-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 9.14
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=grimjim/llama-3-Nephilim-v3-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 6.04
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=grimjim/llama-3-Nephilim-v3-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 8.33
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=grimjim/llama-3-Nephilim-v3-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 29.02
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=grimjim/llama-3-Nephilim-v3-8B
      name: Open LLM Leaderboard

llama-3-Nephilim-v3-8B

This repo contains a merge of pre-trained language models created using mergekit.

GGUF quants are here.

Although none of the components of this merge were trained or intended for roleplay, the model can be used effectively in that role.

Tested with temperature 1 and minP 0.01. This model leans toward being creative, so adjust temperature upward or downward as desired.
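To make the two sampling settings above concrete, here is a minimal pure-Python sketch of temperature scaling followed by min-p filtering. It is an illustration only, not the code of any particular inference backend: the function names `min_p_filter` and `sample_token` are hypothetical, and backends differ in the order in which they apply temperature and min-p (this sketch applies temperature first).

```python
import math
import random


def min_p_filter(logits, temperature=1.0, min_p=0.01):
    """Return renormalized token probabilities after temperature and min-p.

    Hypothetical helper for illustration. Tokens whose probability is below
    min_p times the probability of the most likely token are discarded.
    """
    # Temperature scaling: values above 1 flatten the distribution,
    # values below 1 sharpen it.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # min-p: the cutoff scales with the top token's probability.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # Renormalize the surviving tokens so they sum to 1.
    kept_total = sum(kept)
    return [p / kept_total for p in kept]


def sample_token(logits, temperature=1.0, min_p=0.01, rng=random):
    """Draw one token index from the filtered distribution."""
    probs = min_p_filter(logits, temperature, min_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy logits whose softmax is proportional to 1 : 0.5 : 0.001.
toy_logits = [0.0, math.log(0.5), math.log(0.001)]
filtered = min_p_filter(toy_logits, temperature=1.0, min_p=0.01)
```

With `min_p=0.01`, only tokens at least 1% as likely as the top token survive, so the third toy token is dropped while the first two keep their 2:1 ratio. Raising the temperature flattens the distribution before the cutoff is applied, which lets more tokens pass the filter.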

The merged model has some initial format consistency issues, but these can be mitigated in an Instruct prompt. Additionally, prompt steering was employed to vary the text generation output and avoid some of the common failings observed during text generation with Llama 3 8B models. The complete Instruct prompt used during testing is available below.

Built with Meta Llama 3.

Merge Details

Merge Method

This model was merged using the task arithmetic merge method, with grimjim/Llama-3-Instruct-8B-SPPO-Iter3-SimPO-merge as the base.

Models Merged

The following models were included in the merge:

- tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1

Configuration

The following YAML configuration was used to produce this model:

base_model: grimjim/Llama-3-Instruct-8B-SPPO-Iter3-SimPO-merge
dtype: bfloat16
merge_method: task_arithmetic
parameters:
  normalize: false
slices:
- sources:
  - layer_range: [0, 32]
    model: grimjim/Llama-3-Instruct-8B-SPPO-Iter3-SimPO-merge
  - layer_range: [0, 32]
    model: tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1
    parameters:
      weight: 0.1
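
As a rough illustration of what this configuration asks for, task arithmetic adds a weighted "task vector" (the difference between a fine-tuned model and the base) back onto the base weights. The sketch below works on plain Python lists with made-up numbers. The function `task_arithmetic` is a hypothetical stand-in, not mergekit's implementation, which operates on full model tensors. It assumes that the base model contributes no task vector of its own and that `normalize: false` leaves the weights unscaled.

```python
def task_arithmetic(base, finetuned_models, weights):
    """Merge parameters as: base + sum_i weight_i * (model_i - base).

    Hypothetical helper for illustration; each argument is a flat list of
    floats standing in for one parameter tensor.
    """
    merged = list(base)
    for params, weight in zip(finetuned_models, weights):
        for i, (b, p) in enumerate(zip(base, params)):
            # (p - b) is the task vector component for this parameter.
            merged[i] += weight * (p - b)
    return merged


# Made-up values standing in for one tensor of each model.
base_params = [1.0, 2.0, -1.0]      # plays the role of the SPPO/SimPO merge
swallow_params = [2.0, 0.0, -1.0]   # plays the role of Llama-3-Swallow

# weight: 0.1, as in the configuration above.
merged_params = task_arithmetic(base_params, [swallow_params], [0.1])
# merged_params is approximately [1.1, 1.8, -1.0]
```

With a weight of 0.1, the result stays close to the base model and moves only one tenth of the way toward the second model along each parameter.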

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---:|
| Avg. | 20.54 |
| IFEval (0-Shot) | 41.74 |
| BBH (3-Shot) | 28.96 |
| MATH Lvl 5 (4-Shot) | 9.14 |
| GPQA (0-shot) | 6.04 |
| MuSR (0-shot) | 8.33 |
| MMLU-PRO (5-shot) | 29.02 |
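
The reported average appears to be the unweighted mean of the six benchmark scores. A quick check, using only the values listed above (the variable names are illustrative):

```python
# Benchmark scores copied from the results above.
scores = {
    "IFEval (0-Shot)": 41.74,
    "BBH (3-Shot)": 28.96,
    "MATH Lvl 5 (4-Shot)": 9.14,
    "GPQA (0-shot)": 6.04,
    "MuSR (0-shot)": 8.33,
    "MMLU-PRO (5-shot)": 29.02,
}

# Unweighted mean across all six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Avg. {average:.2f}")
```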