Initialize project; model provided by the ModelHub XC community

Model: NousResearch/Nous-Hermes-llama-2-7b
Source: Original Platform
ModelHub XC
2026-05-04 19:39:00 +08:00
commit c144166144
12 changed files with 93699 additions and 0 deletions

.gitattributes vendored Normal file

@@ -0,0 +1,37 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
model.safetensors filter=lfs diff=lfs merge=lfs -text
pytorch_model.bin filter=lfs diff=lfs merge=lfs -text

README.md Normal file

@@ -0,0 +1,158 @@
---
language:
- en
tags:
- llama-2
- self-instruct
- distillation
- synthetic instruction
license:
- mit
new_version: NousResearch/Hermes-3-Llama-3.1-8B
---
# Model Card: Nous-Hermes-Llama2-7b
Compute provided by our project sponsor Redmond AI, thank you! Follow RedmondAI on Twitter @RedmondAI.
## Model Description
Nous-Hermes-Llama2-7b is a state-of-the-art language model fine-tuned on over 300,000 instructions. This model was fine-tuned by Nous Research, with Teknium leading the fine-tuning process and dataset curation, Redmond AI sponsoring the compute, and several other contributors.
This Hermes model uses the exact same dataset as Hermes on Llama-1, to ensure consistency between the old Hermes and the new: anyone who wanted to keep Hermes as similar to the old one as possible gets a model that is simply more capable.
This model stands out for its long responses, lower hallucination rate, and absence of OpenAI censorship mechanisms. The fine-tuning was performed with a 4096 sequence length on an 8x A100 80GB DGX machine.
## Model Training
The model was trained almost entirely on synthetic GPT-4 outputs. Curating high-quality GPT-4 datasets enables incredibly high quality in knowledge, task completion, and style.
This includes data from diverse sources such as GPTeacher, the general, roleplay v1&2, and code instruct datasets, Nous Instruct & PDACTL (unpublished), and several others, detailed further below.
## Collaborators
The model fine-tuning and the datasets were a collaboration of efforts and resources between Teknium, Karan4D, Emozilla, Huemin Art and Redmond AI.
Special mention goes to @winglian for assisting with some of the training issues.
A huge shoutout and acknowledgement go to all the dataset creators who generously share their datasets openly.
Among the contributors of datasets:
- GPTeacher was made available by Teknium
- Wizard LM was made available by nlpxucan
- The Nous Research Instruct Dataset was provided by Karan4D and HueminArt
- GPT4-LLM and Unnatural Instructions were provided by Microsoft
- The Airoboros dataset was provided by jondurbin
- Camel-AI's domain expert datasets were provided by Camel-AI
- The CodeAlpaca dataset was provided by Sahil 2801
If anyone was left out, please open a thread in the community tab.
## Prompt Format
The model follows the Alpaca prompt format:
```
### Instruction:
<prompt>
### Response:
<leave a newline blank for model to respond>
```
or
```
### Instruction:
<prompt>
### Input:
<additional context>
### Response:
<leave a newline blank for model to respond>
```
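For concreteness, here is a minimal sketch of assembling these prompts as Python strings; the `build_prompt` helper and the example instruction are illustrative, not part of this repository, and the card above defines only the section headers, not exact spacing:
```python
from typing import Optional

def build_prompt(instruction: str, input_text: Optional[str] = None) -> str:
    # Alpaca-style sections per the Prompt Format above; spacing is an assumption.
    if input_text is not None:
        return (f"### Instruction:\n{instruction}\n"
                f"### Input:\n{input_text}\n"
                f"### Response:\n")
    return f"### Instruction:\n{instruction}\n### Response:\n"

print(build_prompt("Summarize Git LFS in one sentence."))
```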
## Benchmark Results
GPT4All:
```
| Task |Version| Metric |Value | |Stderr|
|-------------|------:|--------|-----:|---|-----:|
|arc_challenge| 0|acc |0.4735|± |0.0146|
| | |acc_norm|0.5017|± |0.0146|
|arc_easy | 0|acc |0.7946|± |0.0083|
| | |acc_norm|0.7605|± |0.0088|
|boolq | 1|acc |0.8000|± |0.0070|
|hellaswag | 0|acc |0.5924|± |0.0049|
| | |acc_norm|0.7774|± |0.0042|
|openbookqa | 0|acc |0.3600|± |0.0215|
| | |acc_norm|0.4660|± |0.0223|
|piqa | 0|acc |0.7889|± |0.0095|
| | |acc_norm|0.7976|± |0.0094|
|winogrande | 0|acc |0.6993|± |0.0129|
Average: 0.686
```
BigBench:
```
| Task |Version| Metric |Value | |Stderr|
|------------------------------------------------|------:|---------------------|-----:|---|-----:|
|bigbench_causal_judgement | 0|multiple_choice_grade|0.5579|± |0.0361|
|bigbench_date_understanding | 0|multiple_choice_grade|0.6233|± |0.0253|
|bigbench_disambiguation_qa | 0|multiple_choice_grade|0.3062|± |0.0288|
|bigbench_geometric_shapes | 0|multiple_choice_grade|0.2006|± |0.0212|
| | |exact_str_match |0.0000|± |0.0000|
|bigbench_logical_deduction_five_objects | 0|multiple_choice_grade|0.2540|± |0.0195|
|bigbench_logical_deduction_seven_objects | 0|multiple_choice_grade|0.1657|± |0.0141|
|bigbench_logical_deduction_three_objects | 0|multiple_choice_grade|0.4067|± |0.0284|
|bigbench_movie_recommendation | 0|multiple_choice_grade|0.2780|± |0.0201|
|bigbench_navigate | 0|multiple_choice_grade|0.5000|± |0.0158|
|bigbench_reasoning_about_colored_objects | 0|multiple_choice_grade|0.4405|± |0.0111|
|bigbench_ruin_names | 0|multiple_choice_grade|0.2701|± |0.0210|
|bigbench_salient_translation_error_detection | 0|multiple_choice_grade|0.2034|± |0.0127|
|bigbench_snarks | 0|multiple_choice_grade|0.5028|± |0.0373|
|bigbench_sports_understanding | 0|multiple_choice_grade|0.6136|± |0.0155|
|bigbench_temporal_sequences | 0|multiple_choice_grade|0.2720|± |0.0141|
|bigbench_tracking_shuffled_objects_five_objects | 0|multiple_choice_grade|0.1944|± |0.0112|
|bigbench_tracking_shuffled_objects_seven_objects| 0|multiple_choice_grade|0.1497|± |0.0085|
|bigbench_tracking_shuffled_objects_three_objects| 0|multiple_choice_grade|0.4067|± |0.0284|
Average: 0.3525
```
AGIEval:
```
| Task |Version| Metric |Value | |Stderr|
|------------------------------|------:|--------|-----:|---|-----:|
|agieval_aqua_rat | 0|acc |0.2520|± |0.0273|
| | |acc_norm|0.2402|± |0.0269|
|agieval_logiqa_en | 0|acc |0.2796|± |0.0176|
| | |acc_norm|0.3241|± |0.0184|
|agieval_lsat_ar | 0|acc |0.2478|± |0.0285|
| | |acc_norm|0.2348|± |0.0280|
|agieval_lsat_lr | 0|acc |0.2843|± |0.0200|
| | |acc_norm|0.2765|± |0.0198|
|agieval_lsat_rc | 0|acc |0.3271|± |0.0287|
| | |acc_norm|0.3011|± |0.0280|
|agieval_sat_en | 0|acc |0.4660|± |0.0348|
| | |acc_norm|0.4223|± |0.0345|
|agieval_sat_en_without_passage| 0|acc |0.3738|± |0.0338|
| | |acc_norm|0.3447|± |0.0332|
|agieval_sat_math | 0|acc |0.2500|± |0.0293|
| | |acc_norm|0.2364|± |0.0287|
Average: 0.2975
```
## Resources for Applied Use Cases:
For an example of a back-and-forth chatbot using Hugging Face transformers and Discord, check out: https://github.com/teknium1/alpaca-discord
For an example of a roleplaying Discord chatbot, check out: https://github.com/teknium1/alpaca-roleplay-discordbot
LM Studio is a good choice for a chat interface and supports GGML versions (to come).
## Future Plans
We plan to continue iterating on both more high-quality data and new data-filtering techniques to eliminate lower-quality data going forward.
## Model Usage
The model is available for download on Hugging Face. It is suitable for a wide range of language tasks, from generating creative text to understanding and following complex instructions.
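As a hedged sketch of that usage with Hugging Face transformers (the repo id is the upstream one shown in the commit message; a mirror would substitute its own, and `device_map="auto"` additionally requires the accelerate package):
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NousResearch/Nous-Hermes-llama-2-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Alpaca-style prompt per the Prompt Format section above.
prompt = "### Instruction:\nWrite a haiku about the sea.\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs, max_new_tokens=128, do_sample=True,
    temperature=0.9, top_p=0.6,  # defaults from generation_config.json below
)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```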

added_tokens.json Normal file

@@ -0,0 +1,3 @@
{
"<pad>": 32000
}

config.json Normal file

@@ -0,0 +1,26 @@
{
"_name_or_path": "output/hermes-llama2-4k/checkpoint-2259",
"architectures": [
"LlamaForCausalLM"
],
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 11008,
"max_position_embeddings": 4096,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 32,
"pad_token_id": 0,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.32.0.dev0",
"use_cache": false,
"vocab_size": 32000
}
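A small sketch of reading these architecture fields without downloading the weights (again assuming the upstream repo id):
```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("NousResearch/Nous-Hermes-llama-2-7b")
print(config.model_type)               # llama
print(config.hidden_size)              # 4096
print(config.num_hidden_layers)        # 32
print(config.max_position_embeddings)  # 4096, matching the fine-tuning sequence length
```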

configuration.json Normal file

@@ -0,0 +1 @@
{"framework": "pytorch", "task": "text-generation", "allow_remote": true}

generation_config.json Normal file

@@ -0,0 +1,9 @@
{
"_from_model_config": true,
"bos_token_id": 1,
"eos_token_id": 2,
"pad_token_id": 0,
"temperature": 0.9,
"top_p": 0.6,
"transformers_version": "4.32.0.dev0"
}
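These sampling defaults (temperature 0.9, top_p 0.6) are picked up automatically by `model.generate()`; a short sketch of inspecting them explicitly:
```python
from transformers import GenerationConfig

gen = GenerationConfig.from_pretrained("NousResearch/Nous-Hermes-llama-2-7b")
print(gen.temperature, gen.top_p)  # 0.9 0.6
# Any keyword argument passed to model.generate() overrides these defaults.
```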

model.safetensors Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:23d88a682b353e48ccedfcb63b9f58ffa634267d2e4311079fe1fa3e53161219
size 13476865232
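The entry above is a Git LFS pointer, not the weights themselves; after downloading the real file, its SHA-256 should match the pointer's `oid`. A hedged sketch of that check (assuming the huggingface_hub client can reach the hosting platform):
```python
import hashlib
from huggingface_hub import hf_hub_download

path = hf_hub_download("NousResearch/Nous-Hermes-llama-2-7b", "model.safetensors")
digest = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # stream in 1 MiB chunks
        digest.update(chunk)
# Expect: 23d88a682b353e48ccedfcb63b9f58ffa634267d2e4311079fe1fa3e53161219
print(digest.hexdigest())
```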

pytorch_model.bin Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fbf5845c2a382e91ccb844d39463d118b1edbb749dfcf87beee8c20e8cd49d3b
size 13476978469

special_tokens_map.json Normal file

@@ -0,0 +1,24 @@
{
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"pad_token": "<unk>",
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
}

tokenizer.json Normal file

File diff suppressed because it is too large

tokenizer.model Normal file

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
size 499723

tokenizer_config.json Normal file

@@ -0,0 +1,32 @@
{
"bos_token": {
"__type": "AddedToken",
"content": "<s>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"clean_up_tokenization_spaces": false,
"eos_token": {
"__type": "AddedToken",
"content": "</s>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
},
"legacy": false,
"model_max_length": 1000000000000000019884624838656,
"pad_token": null,
"sp_model_kwargs": {},
"tokenizer_class": "LlamaTokenizer",
"unk_token": {
"__type": "AddedToken",
"content": "<unk>",
"lstrip": false,
"normalized": true,
"rstrip": false,
"single_word": false
}
}
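Taken together, tokenizer_config.json, special_tokens_map.json, and added_tokens.json define the special tokens; a brief sketch of checking how they resolve after loading:
```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("NousResearch/Nous-Hermes-llama-2-7b")
print(tok.bos_token, tok.eos_token, tok.unk_token)  # <s> </s> <unk>
# pad_token is null here but mapped to <unk> in special_tokens_map.json,
# while added_tokens.json also registers <pad> as id 32000 -- print to
# confirm which one the loaded tokenizer actually uses.
print(tok.pad_token)
```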