Initialize project; model provided by the ModelHub XC community

Model: Dampfinchen/Llama-3.1-8B-Ultra-Instruct
Source: Original Platform
ModelHub XC
2026-05-18 04:06:09 +08:00
commit 84880a9c18
13 changed files with 412927 additions and 0 deletions

35
.gitattributes vendored Normal file

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text

167
README.md Normal file

@@ -0,0 +1,167 @@
---
license: llama3
library_name: transformers
tags:
- mergekit
- merge
base_model:
- nbeerbower/llama3.1-gutenberg-8B
- akjindal53244/Llama-3.1-Storm-8B
- NousResearch/Meta-Llama-3.1-8B
- nbeerbower/llama3.1-airoboros3.2-QDT-8B
- Sao10K/Llama-3.1-8B-Stheno-v3.4
model-index:
- name: Llama-3.1-8B-Ultra-Instruct
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 80.81
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Dampfinchen/Llama-3.1-8B-Ultra-Instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 32.49
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Dampfinchen/Llama-3.1-8B-Ultra-Instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 14.95
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Dampfinchen/Llama-3.1-8B-Ultra-Instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 5.59
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Dampfinchen/Llama-3.1-8B-Ultra-Instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 8.61
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Dampfinchen/Llama-3.1-8B-Ultra-Instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 31.4
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Dampfinchen/Llama-3.1-8B-Ultra-Instruct
      name: Open LLM Leaderboard
---
# merge
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
## Merge Details
### Merge Method
This model was merged with the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with [NousResearch/Meta-Llama-3.1-8B](https://huggingface.co/NousResearch/Meta-Llama-3.1-8B) as the base.
### Models Merged
The following models were included in the merge:
* [nbeerbower/llama3.1-gutenberg-8B](https://huggingface.co/nbeerbower/llama3.1-gutenberg-8B)
* [akjindal53244/Llama-3.1-Storm-8B](https://huggingface.co/akjindal53244/Llama-3.1-Storm-8B)
* [nbeerbower/llama3.1-airoboros3.2-QDT-8B](https://huggingface.co/nbeerbower/llama3.1-airoboros3.2-QDT-8B)
* [Sao10K/Llama-3.1-8B-Stheno-v3.4](https://huggingface.co/Sao10K/Llama-3.1-8B-Stheno-v3.4)
### Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: Sao10K/Llama-3.1-8B-Stheno-v3.4
    parameters:
      weight: 0.2
      density: 0.5
  - model: akjindal53244/Llama-3.1-Storm-8B
    parameters:
      weight: 0.5
      density: 0.5
  - model: nbeerbower/llama3.1-gutenberg-8B
    parameters:
      weight: 0.3
      density: 0.5
  - model: nbeerbower/llama3.1-airoboros3.2-QDT-8B
    parameters:
      weight: 0.2
      density: 0.5
merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3.1-8B
dtype: bfloat16
name: Llama-3.1-8B-Ultra-Instruct
```
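To give a rough intuition for what `weight` and `density` control in the configuration above, here is a minimal pure-Python sketch of the DARE TIES idea on toy vectors. It is a simplification for illustration, not mergekit's implementation: the function names `dare_prune` and `dare_ties_merge` are hypothetical, real merges operate per tensor, and details such as weight normalization are left out.

```python
import random

def dare_prune(delta, density, rng):
    """DARE step: keep each entry with probability `density`,
    then rescale survivors by 1/density so the expected value is unchanged."""
    return [d / density if rng.random() < density else 0.0 for d in delta]

def dare_ties_merge(base, finetuned, weights, density, seed=0):
    """Toy DARE TIES merge of several fine-tuned vectors onto a base vector."""
    rng = random.Random(seed)
    # Task vectors: what each fine-tune changed relative to the base.
    deltas = [[f - b for f, b in zip(model, base)] for model in finetuned]
    # Randomly drop and rescale each task vector, then apply its weight.
    pruned = [dare_prune(d, density, rng) for d in deltas]
    weighted = [[w * x for x in d] for w, d in zip(weights, pruned)]
    merged = []
    for i, b in enumerate(base):
        column = [d[i] for d in weighted]
        # TIES sign election: the sign of the summed weighted deltas wins.
        sign = 1.0 if sum(column) >= 0 else -1.0
        # Only contributions that agree with the elected sign are added.
        merged.append(b + sum(x for x in column if x * sign > 0))
    return merged

# Toy usage with two "fine-tunes"; density=1.0 disables the random drop.
toy_base = [0.0, 0.0]
toy_models = [[1.0, -1.0], [1.0, 2.0]]
toy_result = dare_ties_merge(toy_base, toy_models, weights=[0.5, 0.5], density=1.0)
print(toy_result)
```

With `density: 0.5`, as in the configuration, roughly half of each task vector's entries would be zeroed and the rest doubled before the sign election.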
Use the Llama 3 Instruct prompt template. Use with caution; I'm not responsible for what you do with it. All credit and thanks go to the creators of the fine-tunes I've merged. In my own tests and on the Hugging Face Open LLM Leaderboard evaluation, it performs very well for an 8B model, and I can recommend it. High-quality quants by Bartowski: https://huggingface.co/bartowski/Llama-3.1-8B-Ultra-Instruct-GGUF
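As a sketch of what that prompt template looks like, the helper below assembles a prompt by hand. The `<|begin_of_text|>` and `<|eot_id|>` tokens appear in this repository's `special_tokens_map.json`; the `<|start_header_id|>` and `<|end_header_id|>` markers and the overall layout are assumed from the published Llama 3 Instruct format, since the tokenizer's own chat template is not visible in this diff. The function name `build_llama3_prompt` is hypothetical.

```python
def build_llama3_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts in the Llama 3 Instruct layout."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

example_prompt = build_llama3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize DARE TIES in one sentence."},
])
print(example_prompt)
```

In practice, letting the tokenizer apply its bundled chat template is usually safer than formatting strings by hand; the sketch only shows the expected shape of the text.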
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Dampfinchen__Llama-3.1-8B-Ultra-Instruct)
| Metric |Value|
|-------------------|----:|
|Avg. |28.98|
|IFEval (0-Shot) |80.81|
|BBH (3-Shot) |32.49|
|MATH Lvl 5 (4-Shot)|14.95|
|GPQA (0-shot) | 5.59|
|MuSR (0-shot) | 8.61|
|MMLU-PRO (5-shot) |31.40|
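
The "Avg." row can be checked against the six benchmark scores in the table. The short sketch below assumes it is a plain unweighted mean rounded half-up to two decimals; it uses `Decimal` to avoid binary floating-point rounding surprises.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scores copied from the table above.
scores = {
    "IFEval (0-Shot)": Decimal("80.81"),
    "BBH (3-Shot)": Decimal("32.49"),
    "MATH Lvl 5 (4-Shot)": Decimal("14.95"),
    "GPQA (0-shot)": Decimal("5.59"),
    "MuSR (0-shot)": Decimal("8.61"),
    "MMLU-PRO (5-shot)": Decimal("31.40"),
}

mean_score = sum(scores.values()) / len(scores)
rounded_mean = mean_score.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
print(mean_score, rounded_mean)
```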

38
config.json Normal file

@@ -0,0 +1,38 @@
{
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 128000,
"eos_token_id": [
128001,
128008,
128009
],
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 131072,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": {
"factor": 8.0,
"low_freq_factor": 1.0,
"high_freq_factor": 4.0,
"original_max_position_embeddings": 8192,
"rope_type": "llama3"
},
"rope_theta": 500000.0,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.42.3",
"use_cache": true,
"vocab_size": 128256
}
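
The values in `config.json` are enough for a back-of-the-envelope estimate of the parameter count and the KV cache footprint. The sketch below assumes the standard Llama layer layout (bias-free Q/K/V/O projections, a gated MLP with three matrices, two RMSNorm weights per layer, and a final norm); the function names are hypothetical.

```python
# Values copied from config.json above.
cfg = {
    "hidden_size": 4096,
    "intermediate_size": 14336,
    "num_attention_heads": 32,
    "num_hidden_layers": 32,
    "num_key_value_heads": 8,
    "vocab_size": 128256,
    "max_position_embeddings": 131072,
    "tie_word_embeddings": False,
}

def estimate_parameters(c):
    """Estimate total parameters, assuming the standard Llama layout."""
    hidden = c["hidden_size"]
    head_dim = hidden // c["num_attention_heads"]
    kv_dim = c["num_key_value_heads"] * head_dim
    attention = 2 * hidden * hidden + 2 * hidden * kv_dim  # Q and O, then K and V
    mlp = 3 * hidden * c["intermediate_size"]               # gate, up, down
    norms = 2 * hidden                                      # two RMSNorms per layer
    embeddings = c["vocab_size"] * hidden
    total = embeddings + c["num_hidden_layers"] * (attention + mlp + norms) + hidden
    if not c["tie_word_embeddings"]:
        total += embeddings                                 # separate output head
    return total

def kv_cache_bytes_per_token(c, bytes_per_value=2):
    """Bytes of K and V cached per token across all layers (bfloat16 = 2 bytes)."""
    head_dim = c["hidden_size"] // c["num_attention_heads"]
    return 2 * c["num_hidden_layers"] * c["num_key_value_heads"] * head_dim * bytes_per_value

param_count = estimate_parameters(cfg)
per_token = kv_cache_bytes_per_token(cfg)
full_context_gib = per_token * cfg["max_position_embeddings"] / 2**30
print(param_count, per_token, full_context_gib)
```

Under these assumptions the estimate comes to about 8.03 billion parameters, and grouped-query attention with 8 key/value heads keeps the cache at 128 KiB per token, or 16 GiB for a single sequence at the full 131072-token context.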

12
generation_config.json Normal file

@@ -0,0 +1,12 @@
{
"bos_token_id": 128000,
"do_sample": true,
"eos_token_id": [
128001,
128008,
128009
],
"temperature": 0.6,
"top_p": 0.9,
"transformers_version": "4.42.3"
}
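
To illustrate what the `temperature` and `top_p` defaults in `generation_config.json` do, here is a small pure-Python sketch of temperature scaling followed by nucleus (top-p) filtering. It is a simplified illustration, not the transformers implementation, and the function names are hypothetical.

```python
import math
import random

def top_p_filter(logits, temperature=0.6, top_p=0.9):
    """Return {token_index: probability} for the smallest set of tokens
    whose cumulative probability reaches top_p, renormalized to sum to 1."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for index in order:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

def sample_token(logits, rng, temperature=0.6, top_p=0.9):
    """Draw one token index from the filtered distribution."""
    dist = top_p_filter(logits, temperature, top_p)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]

toy_logits = [2.0, 1.0, 0.0]
toy_dist = top_p_filter(toy_logits)
toy_token = sample_token(toy_logits, random.Random(0))
print(toy_dist, toy_token)
```

A temperature below 1 sharpens the distribution before filtering, so with these toy logits the least likely token falls outside the nucleus and is never sampled.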

21
mergekit_config.yml Normal file

@@ -0,0 +1,21 @@
models:
  - model: Sao10K/Llama-3.1-8B-Stheno-v3.4
    parameters:
      weight: 0.2
      density: 0.5
  - model: akjindal53244/Llama-3.1-Storm-8B
    parameters:
      weight: 0.5
      density: 0.5
  - model: nbeerbower/llama3.1-gutenberg-8B
    parameters:
      weight: 0.3
      density: 0.5
  - model: nbeerbower/llama3.1-airoboros3.2-QDT-8B
    parameters:
      weight: 0.2
      density: 0.5
merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3.1-8B
dtype: bfloat16
name: Llama-3.1-8B-Ultra-Instruct


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7fedd9bb5c4fd8624f7e8f3c7b751864f98399c54e25d2ade5bd1108728444ea
size 4953586384


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:78429beb9e407ff1fba9a2f9c71ced1b788d8f90cd7ab38def5de7da5fde5ee2
size 4999819336


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2627395f2bdd92b4f5ea1ecf466c942b31a5bc10567a94dfade1e9d09365022f
size 4915916144


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:384a22db6ef7f3306d1da78659a23e970df6d68e99210a0281fae0d1969a1a60
size 1191234472
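
Each of the four hunks above is a Git LFS pointer file (version, object ID, size in bytes) rather than the weights themselves. As a sketch, the pointers can be parsed and their sizes summed to sanity-check the download size; the helper name `parse_lfs_pointer` is hypothetical, and the file names are omitted because the diff does not show them.

```python
# Pointer contents copied from the four hunks above.
POINTERS = [
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:7fedd9bb5c4fd8624f7e8f3c7b751864f98399c54e25d2ade5bd1108728444ea\n"
    "size 4953586384\n",
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:78429beb9e407ff1fba9a2f9c71ced1b788d8f90cd7ab38def5de7da5fde5ee2\n"
    "size 4999819336\n",
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:2627395f2bdd92b4f5ea1ecf466c942b31a5bc10567a94dfade1e9d09365022f\n"
    "size 4915916144\n",
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:384a22db6ef7f3306d1da78659a23e970df6d68e99210a0281fae0d1969a1a60\n"
    "size 1191234472\n",
]

def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }

parsed = [parse_lfs_pointer(p) for p in POINTERS]
total_bytes = sum(p["size"] for p in parsed)
# config.json declares bfloat16, so assume 2 bytes per parameter.
implied_params = total_bytes // 2
print(total_bytes, implied_params)
```

The total is about 16.06 GB, which at 2 bytes per bfloat16 value implies roughly 8.03 billion parameters; the small remainder is presumably file header overhead.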

File diff suppressed because one or more lines are too long

16
special_tokens_map.json Normal file

@@ -0,0 +1,16 @@
{
"bos_token": {
"content": "<|begin_of_text|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|eot_id|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

410563
tokenizer.json Normal file

File diff suppressed because it is too large

2062
tokenizer_config.json Normal file

File diff suppressed because it is too large