Initialize project; model provided by the ModelHub XC community

Model: NousResearch/Nous-Hermes-llama-2-7b Source: Original Platform
2026-05-04 19:39:00 +08:00
commit c144166144
12 changed files with 93699 additions and 0 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -0,0 +1,37 @@
 *.7z filter=lfs diff=lfs merge=lfs -text
 *.arrow filter=lfs diff=lfs merge=lfs -text
 *.bin filter=lfs diff=lfs merge=lfs -text
 *.bz2 filter=lfs diff=lfs merge=lfs -text
 *.ckpt filter=lfs diff=lfs merge=lfs -text
 *.ftz filter=lfs diff=lfs merge=lfs -text
 *.gz filter=lfs diff=lfs merge=lfs -text
 *.h5 filter=lfs diff=lfs merge=lfs -text
 *.joblib filter=lfs diff=lfs merge=lfs -text
 *.lfs.* filter=lfs diff=lfs merge=lfs -text
 *.mlmodel filter=lfs diff=lfs merge=lfs -text
 *.model filter=lfs diff=lfs merge=lfs -text
 *.msgpack filter=lfs diff=lfs merge=lfs -text
 *.npy filter=lfs diff=lfs merge=lfs -text
 *.npz filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.parquet filter=lfs diff=lfs merge=lfs -text
 *.pb filter=lfs diff=lfs merge=lfs -text
 *.pickle filter=lfs diff=lfs merge=lfs -text
 *.pkl filter=lfs diff=lfs merge=lfs -text
 *.pt filter=lfs diff=lfs merge=lfs -text
 *.pth filter=lfs diff=lfs merge=lfs -text
 *.rar filter=lfs diff=lfs merge=lfs -text
 *.safetensors filter=lfs diff=lfs merge=lfs -text
 saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.tar.* filter=lfs diff=lfs merge=lfs -text
 *.tar filter=lfs diff=lfs merge=lfs -text
 *.tflite filter=lfs diff=lfs merge=lfs -text
 *.tgz filter=lfs diff=lfs merge=lfs -text
 *.wasm filter=lfs diff=lfs merge=lfs -text
 *.xz filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 model.safetensors filter=lfs diff=lfs merge=lfs -text
 pytorch_model.bin filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,158 @@
 ---
 language:
 - en
 tags:
 - llama-2
 - self-instruct
 - distillation
 - synthetic instruction
 license:
 - mit
 new_version: NousResearch/Hermes-3-Llama-3.1-8B
 ---
 # Model Card: Nous-Hermes-Llama2-7b
 Compute provided by our project sponsor Redmond AI, thank you! Follow RedmondAI on Twitter @RedmondAI.
 ## Model Description
 Nous-Hermes-Llama2-7b is a state-of-the-art language model fine-tuned on over 300,000 instructions. This model was fine-tuned by Nous Research, with Teknium leading the fine-tuning process and dataset curation, Redmond AI sponsoring the compute, and several other contributors.
 This Hermes model uses the exact same dataset as Hermes on Llama-1. This keeps the new Hermes consistent with the old one, for anyone who wants a model as similar as possible to the original Hermes, just more capable.
 This model stands out for its long responses, lower hallucination rate, and absence of OpenAI censorship mechanisms. The fine-tuning process was performed with a 4096 sequence length on an 8x A100 80GB DGX machine.
 ## Model Training
 The model was trained almost entirely on synthetic GPT-4 outputs. Curating high quality GPT-4 datasets enables incredibly high quality in knowledge, task completion, and style.
 This includes data from diverse sources such as GPTeacher, the general, roleplay v1&2, code instruct datasets, Nous Instruct & PDACTL (unpublished), and several others, detailed further below.
 ## Collaborators
 The model fine-tuning and the datasets were a collaboration of efforts and resources between Teknium, Karan4D, Emozilla, Huemin Art and Redmond AI.
 Special mention goes to @winglian for assisting with some of the training issues.
 A huge shoutout goes to all the dataset creators who generously share their datasets openly.
 Among the contributors of datasets:
 - GPTeacher was made available by Teknium
 - Wizard LM by nlpxucan
 - Nous Research Instruct Dataset was provided by Karan4D and HueminArt.  
 - GPT4-LLM and Unnatural Instructions were provided by Microsoft
 - Airoboros dataset by jondurbin
 - Camel-AI's domain expert datasets are from Camel-AI
 - CodeAlpaca dataset by Sahil 2801.
 If anyone was left out, please open a thread in the community tab.
 ## Prompt Format
 The model follows the Alpaca prompt format:
 ```
 ### Instruction:
 <prompt>
 ### Response:
 <leave a newline blank for model to respond>
 ```
 or 
 ```
 ### Instruction:
 <prompt>
 ### Input:
 <additional context>
 ### Response:
 <leave a newline blank for model to respond>
 ```
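The two layouts above can be assembled programmatically. The following is a minimal sketch, not part of the original model card: `build_prompt` is a hypothetical helper name, and joining the sections with single newlines is an assumption based on how the templates are printed here.

```python
from typing import Optional


def build_prompt(instruction: str, context: Optional[str] = None) -> str:
    """Assemble an Alpaca-style prompt (hypothetical helper, see assumptions above)."""
    parts = ["### Instruction:", instruction.strip()]
    if context:
        # The optional "### Input:" section carries additional context.
        parts += ["### Input:", context.strip()]
    parts.append("### Response:")
    # End with a newline so the model's reply starts on a fresh line.
    return "\n".join(parts) + "\n"


if __name__ == "__main__":
    print(build_prompt("Summarize the text.", "Hermes is a fine-tuned Llama-2 model."))
```

Passing only an instruction produces the first layout; passing a second argument adds the `### Input:` section of the second layout.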
 GPT4All:
 ```
 |    Task     |Version| Metric |Value |   |Stderr|
 |-------------|------:|--------|-----:|---|-----:|
 |arc_challenge|      0|acc     |0.4735|±  |0.0146|
 |             |       |acc_norm|0.5017|±  |0.0146|
 |arc_easy     |      0|acc     |0.7946|±  |0.0083|
 |             |       |acc_norm|0.7605|±  |0.0088|
 |boolq        |      1|acc     |0.8000|±  |0.0070|
 |hellaswag    |      0|acc     |0.5924|±  |0.0049|
 |             |       |acc_norm|0.7774|±  |0.0042|
 |openbookqa   |      0|acc     |0.3600|±  |0.0215|
 |             |       |acc_norm|0.4660|±  |0.0223|
 |piqa         |      0|acc     |0.7889|±  |0.0095|
 |             |       |acc_norm|0.7976|±  |0.0094|
 |winogrande   |      0|acc     |0.6993|±  |0.0129|
 Average: 0.686
 ```  
 BigBench:
 ```
 |                      Task                      |Version|       Metric        |Value |   |Stderr|
 |------------------------------------------------|------:|---------------------|-----:|---|-----:|
 |bigbench_causal_judgement                       |      0|multiple_choice_grade|0.5579|±  |0.0361|
 |bigbench_date_understanding                     |      0|multiple_choice_grade|0.6233|±  |0.0253|
 |bigbench_disambiguation_qa                      |      0|multiple_choice_grade|0.3062|±  |0.0288|
 |bigbench_geometric_shapes                       |      0|multiple_choice_grade|0.2006|±  |0.0212|
 |                                                |       |exact_str_match      |0.0000|±  |0.0000|
 |bigbench_logical_deduction_five_objects         |      0|multiple_choice_grade|0.2540|±  |0.0195|
 |bigbench_logical_deduction_seven_objects        |      0|multiple_choice_grade|0.1657|±  |0.0141|
 |bigbench_logical_deduction_three_objects        |      0|multiple_choice_grade|0.4067|±  |0.0284|
 |bigbench_movie_recommendation                   |      0|multiple_choice_grade|0.2780|±  |0.0201|
 |bigbench_navigate                               |      0|multiple_choice_grade|0.5000|±  |0.0158|
 |bigbench_reasoning_about_colored_objects        |      0|multiple_choice_grade|0.4405|±  |0.0111|
 |bigbench_ruin_names                             |      0|multiple_choice_grade|0.2701|±  |0.0210|
 |bigbench_salient_translation_error_detection    |      0|multiple_choice_grade|0.2034|±  |0.0127|
 |bigbench_snarks                                 |      0|multiple_choice_grade|0.5028|±  |0.0373|
 |bigbench_sports_understanding                   |      0|multiple_choice_grade|0.6136|±  |0.0155|
 |bigbench_temporal_sequences                     |      0|multiple_choice_grade|0.2720|±  |0.0141|
 |bigbench_tracking_shuffled_objects_five_objects |      0|multiple_choice_grade|0.1944|±  |0.0112|
 |bigbench_tracking_shuffled_objects_seven_objects|      0|multiple_choice_grade|0.1497|±  |0.0085|
 |bigbench_tracking_shuffled_objects_three_objects|      0|multiple_choice_grade|0.4067|±  |0.0284|
 Average: 0.3525
 ```  
 AGIEval:
 ```  
 |             Task             |Version| Metric |Value |   |Stderr|
 |------------------------------|------:|--------|-----:|---|-----:|
 |agieval_aqua_rat              |      0|acc     |0.2520|±  |0.0273|
 |                              |       |acc_norm|0.2402|±  |0.0269|
 |agieval_logiqa_en             |      0|acc     |0.2796|±  |0.0176|
 |                              |       |acc_norm|0.3241|±  |0.0184|
 |agieval_lsat_ar               |      0|acc     |0.2478|±  |0.0285|
 |                              |       |acc_norm|0.2348|±  |0.0280|
 |agieval_lsat_lr               |      0|acc     |0.2843|±  |0.0200|
 |                              |       |acc_norm|0.2765|±  |0.0198|
 |agieval_lsat_rc               |      0|acc     |0.3271|±  |0.0287|
 |                              |       |acc_norm|0.3011|±  |0.0280|
 |agieval_sat_en                |      0|acc     |0.4660|±  |0.0348|
 |                              |       |acc_norm|0.4223|±  |0.0345|
 |agieval_sat_en_without_passage|      0|acc     |0.3738|±  |0.0338|
 |                              |       |acc_norm|0.3447|±  |0.0332|
 |agieval_sat_math              |      0|acc     |0.2500|±  |0.0293|
 |                              |       |acc_norm|0.2364|±  |0.0287|
 Average: 0.2975
 ```  
 ## Benchmark Results
 ## Resources for Applied Use Cases:
 For an example of a back-and-forth chatbot using Hugging Face Transformers and Discord, check out: https://github.com/teknium1/alpaca-discord  
 For an example of a roleplaying Discord chatbot, check out: https://github.com/teknium1/alpaca-roleplay-discordbot  
 LM Studio is a good choice for a chat interface that supports GGML versions (to come).
 ## Future Plans
 We plan to keep iterating on both higher-quality data and new data-filtering techniques that eliminate lower-quality data.
 ## Model Usage
 The model is available for download on Hugging Face. It is suitable for a wide range of language tasks, from generating creative text to understanding and following complex instructions.
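As an illustration of the download-and-generate flow, here is a minimal sketch using Hugging Face Transformers. It is not taken from the original card: the `generate` function name and `max_new_tokens=256` are assumptions, `device_map="auto"` assumes the `accelerate` package is installed, and the sampling values are copied from this repository's `generation_config.json`.

```python
# Sampling values from generation_config.json; max_new_tokens is an arbitrary choice.
GENERATION_KWARGS = {
    "do_sample": True,
    "temperature": 0.9,
    "top_p": 0.6,
    "max_new_tokens": 256,
}


def generate(instruction: str, model_id: str = "NousResearch/Nous-Hermes-llama-2-7b") -> str:
    """Download the model (about 13.5 GB of weights) and answer one instruction."""
    # Imported lazily so this file can be loaded without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # matches "torch_dtype" in config.json
        device_map="auto",           # assumption: requires the accelerate package
    )
    # Alpaca-style prompt, as described in the Prompt Format section.
    prompt = f"### Instruction:\n{instruction}\n### Response:\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("Write a haiku about autumn.")` would load the weights on first use, so a GPU with enough memory for bfloat16 weights is advisable.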
--- a/added_tokens.json
+++ b/added_tokens.json
@@ -0,0 +1,3 @@
 {
  "<pad>": 32000
 }
--- a/config.json
+++ b/config.json
@@ -0,0 +1,26 @@
 {
  "_name_or_path": "output/hermes-llama2-4k/checkpoint-2259",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 11008,
  "max_position_embeddings": 4096,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 32,
  "pad_token_id": 0,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.32.0.dev0",
  "use_cache": false,
  "vocab_size": 32000
 }
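As a sanity check on the config.json values above, the parameter count can be estimated from the configuration alone. This is a sketch under stated assumptions: it assumes the standard Llama layout (four bias-free attention projections, a gate/up/down MLP, two RMSNorm weights per layer, one final norm, and an untied output head), and `count_llama_params` is a hypothetical helper name.

```python
# Fields copied from config.json in this commit.
config = {
    "hidden_size": 4096,
    "intermediate_size": 11008,
    "num_hidden_layers": 32,
    "vocab_size": 32000,
    "tie_word_embeddings": False,
}


def count_llama_params(cfg: dict) -> int:
    """Estimate parameters for a standard Llama layout (assumption, see lead-in)."""
    h, i = cfg["hidden_size"], cfg["intermediate_size"]
    attention = 4 * h * h  # q, k, v, o projections (num_key_value_heads == num_attention_heads)
    mlp = 3 * h * i        # gate, up, and down projections
    norms = 2 * h          # two RMSNorm weight vectors per layer
    per_layer = attention + mlp + norms
    embeddings = cfg["vocab_size"] * h
    lm_head = 0 if cfg["tie_word_embeddings"] else cfg["vocab_size"] * h
    final_norm = h
    return cfg["num_hidden_layers"] * per_layer + embeddings + lm_head + final_norm


params = count_llama_params(config)
bf16_bytes = params * 2  # bfloat16 stores 2 bytes per parameter

if __name__ == "__main__":
    print(f"parameters: {params:,}")
    print(f"bfloat16 weight bytes: {bf16_bytes:,}")
```

The estimate comes to 6,738,415,616 parameters, or 13,476,831,232 bytes in bfloat16. That is 34,000 bytes less than the model.safetensors size recorded in this commit (13476865232), a gap that is plausibly the file's metadata header, though that is an inference rather than something the commit states.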
--- a/configuration.json
+++ b/configuration.json
@@ -0,0 +1 @@
 {"framework": "pytorch", "task": "text-generation", "allow_remote": true}
--- a/generation_config.json
+++ b/generation_config.json
@@ -0,0 +1,9 @@
 {
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "pad_token_id": 0,
  "temperature": 0.9,
  "top_p": 0.6,
  "transformers_version": "4.32.0.dev0"
 }
--- a/model.safetensors
+++ b/model.safetensors
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:23d88a682b353e48ccedfcb63b9f58ffa634267d2e4311079fe1fa3e53161219
 size 13476865232
--- a/pytorch_model.bin
+++ b/pytorch_model.bin
@@ -0,0 +1,3 @@
 version https://git-lfs.github.com/spec/v1
 oid sha256:fbf5845c2a382e91ccb844d39463d118b1edbb749dfcf87beee8c20e8cd49d3b
 size 13476978469
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@@ -0,0 +1,24 @@
 {
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": "<unk>",
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  }
 }
--- a/tokenizer.json
+++ b/tokenizer.json
--- a/tokenizer.model
+++ b/tokenizer.model
--- a/tokenizer_config.json
+++ b/tokenizer_config.json
@@ -0,0 +1,32 @@
 {
  "bos_token": {
    "__type": "AddedToken",
    "content": "<s>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "clean_up_tokenization_spaces": false,
  "eos_token": {
    "__type": "AddedToken",
    "content": "</s>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "legacy": false,
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": null,
  "sp_model_kwargs": {},
  "tokenizer_class": "LlamaTokenizer",
  "unk_token": {
    "__type": "AddedToken",
    "content": "<unk>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  }
 }