Initialize project; model provided by the ModelHub XC community

Model: Goekdeniz-Guelmez/Hyperion-2.0-Mistral-7B-GGUF
Source: Original Platform
ModelHub XC
2026-06-22 06:43:17 +08:00
commit e075330866
32 changed files with 91746 additions and 0 deletions

49
.gitattributes vendored Normal file

@@ -0,0 +1,49 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
hyperion-2.0-mistral-7b.Q2_K.gguf filter=lfs diff=lfs merge=lfs -text
hyperion-2.0-mistral-7b.Q3_K_L.gguf filter=lfs diff=lfs merge=lfs -text
hyperion-2.0-mistral-7b.Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
hyperion-2.0-mistral-7b.Q3_K_S.gguf filter=lfs diff=lfs merge=lfs -text
hyperion-2.0-mistral-7b.Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
hyperion-2.0-mistral-7b.Q4_1.gguf filter=lfs diff=lfs merge=lfs -text
hyperion-2.0-mistral-7b.Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
hyperion-2.0-mistral-7b.Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
hyperion-2.0-mistral-7b.Q5_0.gguf filter=lfs diff=lfs merge=lfs -text
hyperion-2.0-mistral-7b.Q5_1.gguf filter=lfs diff=lfs merge=lfs -text
hyperion-2.0-mistral-7b.Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
hyperion-2.0-mistral-7b.Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
hyperion-2.0-mistral-7b.Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
hyperion-2.0-mistral-7b.Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
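
The patterns above send matching files through Git LFS. Note that there is no `*.gguf` wildcard: each GGUF file is listed by its exact name. As a rough sketch, the check below tests whether a file name would be LFS-tracked. It assumes Python's `fnmatchcase` is a close enough stand-in for gitattributes matching on plain file names (real matching follows gitignore-style rules), and `is_lfs_tracked` and `LFS_PATTERNS` are hypothetical names used only for illustration.

```python
from fnmatch import fnmatchcase

# A subset of the patterns from the .gitattributes above (illustration only).
LFS_PATTERNS = [
    "*.bin",
    "*.safetensors",
    "*.tar.*",
    "*tfevents*",
    "hyperion-2.0-mistral-7b.Q4_K_M.gguf",
]

def is_lfs_tracked(filename, patterns=LFS_PATTERNS):
    """Return True if the file name matches any listed pattern (approximation)."""
    return any(fnmatchcase(filename, pattern) for pattern in patterns)

for name in ["model-00001-of-00008.safetensors", "config.json",
             "hyperion-2.0-mistral-7b.Q4_K_M.gguf"]:
    print(name, is_lfs_tracked(name))
```

A GGUF file added later under a new name would not be covered unless its name is also appended to `.gitattributes`.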

93
README.md Normal file

@@ -0,0 +1,93 @@
---
library_name: transformers
tags:
- code
- chemistry
- medical
license: apache-2.0
datasets:
- Locutusque/hyperion-v2.0
language:
- en
---
# Hyperion-2.0-Mistral-7B
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6437292ecd93f4c9a34b0d47/9BU30Mh9bOkO2HRBDF8EE.png)
## Model Details
- **Model Name**: Locutusque/Hyperion-2.0-Mistral-7B
- **Base Model**: mistralai/Mistral-7B-v0.1
- **Publisher**: Locutusque
- **Model Type**: Question answering, conversational AI, code generation, medical text comprehension, mathematical reasoning, logical reasoning.
- **Language**: English (multi-domain).
- **License**: Apache-2.0
## Model Description
`Locutusque/Hyperion-2.0-Mistral-7B` is a language model fine-tuned on the Hyperion-v2.0 dataset for reasoning across scientific domains. It is designed to handle complex questions and instructions, drawing on the diverse material in the Hyperion dataset. Its primary use cases include, but are not limited to, complex question answering, conversational understanding, code generation, medical text comprehension, mathematical reasoning, and logical reasoning.
## Intended Use
This model is intended for researchers and practitioners looking for a powerful tool to tackle challenging problems in scientific domains. It can be used in the following scenarios:
- AI-driven tutoring systems for science, medicine, mathematics, and computer science.
- Assistive tools for professionals requiring fast and accurate domain-specific information retrieval.
- Platforms that require conversational AI capabilities with a focus on technical and scientific reasoning.
- Automated code generation and understanding of complex programming contexts.
## Training Data
The `Locutusque/Hyperion-2.0-Mistral-7B` model was fine-tuned on the Hyperion-v2.0 dataset, which amalgamates various datasets rich in diversity and complexity, including programming, medical texts, mathematical problems, and reasoning tasks.
## Evaluation Results
0-shot AGIEval

| Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
|---------------------------------|-------|------|-----:|--------|-----:|---|-----:|
|agieval_nous |N/A |none | 0|acc |0.3602|± |0.0929|
| | |none | 0|acc_norm|0.3342|± |0.0764|
| - agieval_aqua_rat | 1|none | 0|acc |0.2402|± |0.0269|
| | |none | 0|acc_norm|0.2441|± |0.0270|
| - agieval_logiqa_en | 1|none | 0|acc |0.2965|± |0.0179|
| | |none | 0|acc_norm|0.3226|± |0.0183|
| - agieval_lsat_ar | 1|none | 0|acc |0.2348|± |0.0280|
| | |none | 0|acc_norm|0.2000|± |0.0264|
| - agieval_lsat_lr | 1|none | 0|acc |0.3667|± |0.0214|
| | |none | 0|acc_norm|0.3373|± |0.0210|
| - agieval_lsat_rc | 1|none | 0|acc |0.4981|± |0.0305|
| | |none | 0|acc_norm|0.4089|± |0.0300|
| - agieval_sat_en | 1|none | 0|acc |0.6359|± |0.0336|
| | |none | 0|acc_norm|0.5777|± |0.0345|
| - agieval_sat_en_without_passage| 1|none | 0|acc |0.3883|± |0.0340|
| | |none | 0|acc_norm|0.3544|± |0.0334|
| - agieval_sat_math | 1|none | 0|acc |0.3500|± |0.0322|
| | |none | 0|acc_norm|0.2682|± |0.0299|

| Groups |Version|Filter|n-shot| Metric |Value | |Stderr|
|------------|-------|------|-----:|--------|-----:|---|-----:|
|agieval_nous|N/A |none | 0|acc |0.3602|± |0.0929|
| | |none | 0|acc_norm|0.3342|± |0.0764|

5-shot AGIEval coming soon.
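
To read the `Value ± Stderr` columns, one option is to turn each pair into an approximate 95% interval. The sketch below assumes a normal approximation (value ± 1.96 × stderr), which is an assumption of this example and not something the evaluation output states. The three rows are copied from the 0-shot table, and `confidence_interval` is a hypothetical helper.

```python
# (task, acc, stderr) rows copied from the 0-shot AGIEval table.
ROWS = [
    ("agieval_nous", 0.3602, 0.0929),
    ("agieval_sat_en", 0.6359, 0.0336),
    ("agieval_lsat_ar", 0.2348, 0.0280),
]

def confidence_interval(value, stderr, z=1.96):
    """Normal-approximation interval: value - z*stderr to value + z*stderr."""
    return (value - z * stderr, value + z * stderr)

for task, acc, stderr in ROWS:
    low, high = confidence_interval(acc, stderr)
    print(f"{task}: acc={acc:.4f}, approx. 95% CI {low:.4f} to {high:.4f}")
```

The group-level standard error (0.0929) is much larger than those of the individual tasks, so the aggregate score should be read with a correspondingly wide margin.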
## How to Use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model weights for this model card
model_name = "Locutusque/Hyperion-2.0-Mistral-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Build a ChatML-style prompt for a text generation task
input_text = "<|im_start|>user\nWhat are the implications of Einstein's theory of relativity in modern physics?<|im_end|>\n<|im_start|>assistant\n"
input_ids = tokenizer.encode(input_text, return_tensors="pt")

# Generate a response; do_sample=True is needed for temperature, top_p, and top_k to take effect
outputs = model.generate(input_ids, max_length=200, num_return_sequences=1, do_sample=True, temperature=0.8, top_p=0.95, top_k=40, repetition_penalty=1.1)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
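
The prompt string in the example uses `<|im_start|>` and `<|im_end|>` markers around each turn. As a minimal sketch, the helper below assembles that layout from a list of messages. The layout is inferred only from the example string above and is not confirmed here as the model's official chat template. `build_chatml_prompt` and its `add_generation_prompt` parameter are hypothetical names.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Join {'role', 'content'} dicts into the <|im_start|>/<|im_end|> layout."""
    parts = [
        f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        for message in messages
    ]
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([{"role": "user", "content": "What is 2 + 2?"}])
print(prompt)
```

The resulting string can be passed to `tokenizer.encode` in place of the hand-written `input_text`.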
## Known Limitations
The diversity of the dataset could lead to inconsistencies in the model's responses due to variations in data formatting and annotation quality.
This model is also very compliant: it will respond to any request. If you plan to use it for enterprise-level deployment, build on it with further alignment such as DPO first.
## Licensing Information
This model is released under the Apache-2.0 license.

26
config.json Normal file

@@ -0,0 +1,26 @@
{
"_name_or_path": "mistralai/Mistral-7B-v0.1",
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 10000.0,
"sliding_window": 4096,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.38.2",
"use_cache": true,
"vocab_size": 32000
}
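
As a rough cross-check, the parameter count can be estimated from the config values above. The sketch assumes the usual Mistral layout: bias-free linear layers, two RMSNorm weights per layer, one final norm, and an `lm_head` that is not tied to the embeddings (`tie_word_embeddings` is false). `count_parameters` is a hypothetical helper.

```python
# Values copied from config.json above.
cfg = {
    "hidden_size": 4096,
    "intermediate_size": 14336,
    "num_attention_heads": 32,
    "num_hidden_layers": 32,
    "num_key_value_heads": 8,
    "vocab_size": 32000,
}

def count_parameters(cfg):
    h = cfg["hidden_size"]
    head_dim = h // cfg["num_attention_heads"]
    kv_dim = head_dim * cfg["num_key_value_heads"]  # grouped-query attention
    attn = 2 * h * h + 2 * h * kv_dim               # q_proj, o_proj, k_proj, v_proj
    mlp = 3 * h * cfg["intermediate_size"]          # gate_proj, up_proj, down_proj
    norms = 2 * h                                   # input and post-attention norms
    per_layer = attn + mlp + norms
    embeddings = 2 * cfg["vocab_size"] * h          # embed_tokens plus untied lm_head
    return cfg["num_hidden_layers"] * per_layer + embeddings + h  # plus final norm

total_params = count_parameters(cfg)
total_bytes = total_params * 2  # bfloat16 stores 2 bytes per parameter
print(total_params, total_bytes)
```

Under these assumptions the count is 7,241,732,096 parameters, or 14,483,464,192 bytes in bfloat16, which is the same figure as `total_size` in the safetensors index in this commit.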

6
generation_config.json Normal file

@@ -0,0 +1,6 @@
{
"_from_model_config": true,
"bos_token_id": 1,
"eos_token_id": 2,
"transformers_version": "4.38.2"
}


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1d5b7c669c7b9ba7fb4e2631a63bc1898e5b2e521f72165932481604b477e80d
size 2719241984
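
Each of these three-line entries is a Git LFS pointer: the repository stores only the spec version, a SHA-256 object ID, and the byte size, while the file itself lives in LFS storage. A small sketch of reading one pointer follows. `parse_lfs_pointer` is a hypothetical helper, and the sample text is the pointer shown above.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    hash_algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:1d5b7c669c7b9ba7fb4e2631a63bc1898e5b2e521f72165932481604b477e80d
size 2719241984
"""

info = parse_lfs_pointer(pointer)
print(f"{info['hash_algo']} object, {info['size'] / 1e9:.2f} GB")
```

After downloading the real file, its SHA-256 digest and byte length can be compared against `digest` and `size` to confirm the download is intact.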


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8d6b69adcdba7f57702eaf50da5afd00e6fd9d05615f34402e1c7602887a5d8a
size 3822024448


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2d70e5a309e33655a31d2b1149dd28730b6c374d4f67ed2faaeb37e47b49d792
size 3518985984


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:208b8463fea8665f81aa2ab1308c74033c2d3af2c87a9193262adcd1b3f279ee
size 3164567296


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1cf72e9a8be24ccf3eb441192da557d319b4f14b3102763cdae9c6f326100b18
size 4108916480


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f9c58dbca3a60bd66e866400909a0f6da9ba30015978a64db09204a17cc63f33
size 4553316096


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ab6f124652ce8eaa2790725c9ffd23518065a0763a83f92ba563e564a8e97906
size 4368439040


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:04165a87f943cbcf52022f66557641b1dcbea460065227397faf5afd6a62a538
size 4140373760


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:481ab6ef188f75489e5e3ca0b1256cc4ab02979fe85cf3fe71cfc087152eb4c3
size 4997715712


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:db234e1f990947f0a77f7b7caef0e8d33eb20f929afe11e5aafde2d2085ea5e5
size 5442115328


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:43e2f7f89ba555bc7d8e551e8086c3184329001fc13c458bca92b6cc7cd20201
size 5131409152


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2ae4d807d76a9b489ee6394c23431126c3587d251949dbdd92585df5199ac790
size 4997715712


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1521f03df4fc15a33107b0335c0648c950d309b7b6e65fbf3bdd9443bc8f8ada
size 5942064896


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:68b54b44e2b390cb2f9bfa1341cf1615fba711845592bb13ed654d5a6148eff4
size 7695857408


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:eb65f9873d50c7536c19e2e07bfc2851a0f99f1a769d53d223236b259777afce
size 14484731584


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5b751e01c9c2cce94e413a3e4311376ca582c60c060bbc6fcb26144985f43d1c
size 1889587040


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:731d805e28e711b8d0e86ed854db30c951aacf41fd5b71acb2183f0cf196c518
size 1946243936


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4c280b57e12ed7dc9c8e6c99c510736bba4af26ebeaf0a9ca22f51035a5fadfa
size 1979781432


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8be6cbd4ef46226db5060c4d66a809bebeb4202648e44c41a80e1329a72ffe2c
size 1946243984


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:56e02d29e71224d5156182d411eb4d7a5db176faa6b31c7f68d3126ea4753c63
size 1979781448


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:39f626db9c19e020952f0a6bae687f3dbd85ed6cccce603eae412f3ce40c847b
size 1946243984


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6702c28da231744ecd41228605ef29be7be031e3f8c61ba5bd2209c513adb2a4
size 1979781448


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:36224ebf1959be4dac9593c2a1622dac4b5540c7bf4e6e5b87cb7ae44be8221a
size 815834680


@@ -0,0 +1,298 @@
{
"metadata": {
"total_size": 14483464192
},
"weight_map": {
"lm_head.weight": "model-00008-of-00008.safetensors",
"model.embed_tokens.weight": "model-00001-of-00008.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00008.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00008.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.1.input_layernorm.weight": "model-00001-of-00008.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00008.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.10.input_layernorm.weight": "model-00003-of-00008.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00003-of-00008.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.11.input_layernorm.weight": "model-00003-of-00008.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00003-of-00008.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.12.input_layernorm.weight": "model-00004-of-00008.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00004-of-00008.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.13.input_layernorm.weight": "model-00004-of-00008.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00004-of-00008.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.14.input_layernorm.weight": "model-00004-of-00008.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00004-of-00008.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.15.input_layernorm.weight": "model-00004-of-00008.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00004-of-00008.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.16.input_layernorm.weight": "model-00004-of-00008.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00004-of-00008.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.17.input_layernorm.weight": "model-00005-of-00008.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00005-of-00008.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00004-of-00008.safetensors",
"model.layers.18.input_layernorm.weight": "model-00005-of-00008.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00005-of-00008.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.19.input_layernorm.weight": "model-00005-of-00008.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00005-of-00008.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.2.input_layernorm.weight": "model-00001-of-00008.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00001-of-00008.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.20.input_layernorm.weight": "model-00005-of-00008.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00005-of-00008.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.21.input_layernorm.weight": "model-00006-of-00008.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00006-of-00008.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00005-of-00008.safetensors",
"model.layers.22.input_layernorm.weight": "model-00006-of-00008.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00006-of-00008.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.23.input_layernorm.weight": "model-00006-of-00008.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00006-of-00008.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.24.input_layernorm.weight": "model-00006-of-00008.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00006-of-00008.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.25.input_layernorm.weight": "model-00006-of-00008.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00006-of-00008.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.26.input_layernorm.weight": "model-00007-of-00008.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00007-of-00008.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00006-of-00008.safetensors",
"model.layers.27.input_layernorm.weight": "model-00007-of-00008.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00007-of-00008.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.28.input_layernorm.weight": "model-00007-of-00008.safetensors",
"model.layers.28.mlp.down_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.28.mlp.gate_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.28.mlp.up_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.28.post_attention_layernorm.weight": "model-00007-of-00008.safetensors",
"model.layers.28.self_attn.k_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.28.self_attn.o_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.28.self_attn.q_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.28.self_attn.v_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.29.input_layernorm.weight": "model-00007-of-00008.safetensors",
"model.layers.29.mlp.down_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.29.mlp.gate_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.29.mlp.up_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.29.post_attention_layernorm.weight": "model-00007-of-00008.safetensors",
"model.layers.29.self_attn.k_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.29.self_attn.o_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.29.self_attn.q_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.29.self_attn.v_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.3.input_layernorm.weight": "model-00002-of-00008.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00002-of-00008.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00001-of-00008.safetensors",
"model.layers.30.input_layernorm.weight": "model-00008-of-00008.safetensors",
"model.layers.30.mlp.down_proj.weight": "model-00008-of-00008.safetensors",
"model.layers.30.mlp.gate_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.30.mlp.up_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.30.post_attention_layernorm.weight": "model-00008-of-00008.safetensors",
"model.layers.30.self_attn.k_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.30.self_attn.o_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.30.self_attn.q_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.30.self_attn.v_proj.weight": "model-00007-of-00008.safetensors",
"model.layers.31.input_layernorm.weight": "model-00008-of-00008.safetensors",
"model.layers.31.mlp.down_proj.weight": "model-00008-of-00008.safetensors",
"model.layers.31.mlp.gate_proj.weight": "model-00008-of-00008.safetensors",
"model.layers.31.mlp.up_proj.weight": "model-00008-of-00008.safetensors",
"model.layers.31.post_attention_layernorm.weight": "model-00008-of-00008.safetensors",
"model.layers.31.self_attn.k_proj.weight": "model-00008-of-00008.safetensors",
"model.layers.31.self_attn.o_proj.weight": "model-00008-of-00008.safetensors",
"model.layers.31.self_attn.q_proj.weight": "model-00008-of-00008.safetensors",
"model.layers.31.self_attn.v_proj.weight": "model-00008-of-00008.safetensors",
"model.layers.4.input_layernorm.weight": "model-00002-of-00008.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00002-of-00008.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.5.input_layernorm.weight": "model-00002-of-00008.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00002-of-00008.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.6.input_layernorm.weight": "model-00002-of-00008.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00002-of-00008.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.7.input_layernorm.weight": "model-00002-of-00008.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00002-of-00008.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.8.input_layernorm.weight": "model-00003-of-00008.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00003-of-00008.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00002-of-00008.safetensors",
"model.layers.9.input_layernorm.weight": "model-00003-of-00008.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00003-of-00008.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00003-of-00008.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00003-of-00008.safetensors",
"model.norm.weight": "model-00008-of-00008.safetensors"
}
}

24
special_tokens_map.json Normal file

@@ -0,0 +1,24 @@
{
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": "</s>",
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}

91136
tokenizer.json Normal file

File diff suppressed because it is too large

BIN
tokenizer.model (Stored with Git LFS) Normal file

Binary file not shown.

42
tokenizer_config.json Normal file

@@ -0,0 +1,42 @@
{
  "add_bos_token": true,
  "add_eos_token": false,
  "added_tokens_decoder": {
    "0": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "additional_special_tokens": [],
  "bos_token": "<s>",
  "clean_up_tokenization_spaces": false,
  "eos_token": "</s>",
  "legacy": true,
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": "</s>",
  "sp_model_kwargs": {},
  "spaces_between_special_tokens": false,
  "tokenizer_class": "LlamaTokenizer",
  "unk_token": "<unk>",
  "use_default_system_prompt": false
}