初始化项目,由ModelHub XC社区提供模型

Model: mrfakename/WizardChatML-7B-v0
Source: Original Platform
This commit is contained in:
ModelHub XC
2026-05-19 07:14:25 +08:00
commit c1e705813d
14 changed files with 91664 additions and 0 deletions

47
.gitattributes vendored Normal file
View File

@@ -0,0 +1,47 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bin.* filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zstandard filter=lfs diff=lfs merge=lfs -text
*.tfevents* filter=lfs diff=lfs merge=lfs -text
*.db* filter=lfs diff=lfs merge=lfs -text
*.ark* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*data* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.meta filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.index filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.gguf* filter=lfs diff=lfs merge=lfs -text
*.ggml filter=lfs diff=lfs merge=lfs -text
*.llamafile* filter=lfs diff=lfs merge=lfs -text
*.pt2 filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text

38
README.md Normal file
View File

@@ -0,0 +1,38 @@
---
license: other
license_name: apache-2.0-mostly
language:
- en
pipeline_tag: text-generation
---
# WizardChatML 7B v0
[GGUF Quants](https://huggingface.co/mrfakename/WizardChatML-7B-v0-Q4_K_M-GGUF)
I personally think ChatML is the best prompt format. It allows:
1. Easier templating for generation
2. Lower risk of inadvertently generating role tokens
3. Better long-context performance and higher quality on quantized models
This model is an experiment attempting to extend WizardLM 2 7B to ChatML. It was trained on a small ChatML dataset, it probably isn't as good as WizardLM 2 Base, but it's an attempt.
Aside from using the ChatML prompt format, this model supports system prompts. In fact, it adheres *very* well to these prompts.
If you want to use this model for task-specific purposes, you should probably fine-tune it.
## Capabilities & Challenges
* Seems ok-ish at writing
* Pretty good at math
* Sometimes calls itself ChatGPT/OpenAI
## Risks
It has not been trained on guardrail data and may generate offensive content if prompted.
## License
If you use this model, you must include the Apache 2.0 license AND the following notice:
I'm releasing this model under the Apache 2.0 license, with the added restriction that it cannot be used to compete with OpenAI (due to the nature of the training data). Additionally, this model was finetuned from the WizardLM 2 7B model, which was recently removed by Microsoft (it was Apache licensed, but may have been trained on NC-licensed data). You are responsible for the usage of this model. You are responsible for checking that your usage of this model is legal in your jurisdiction. Commercial use is not advised, as this model is finetuned from a model that may have been trained on NC-licensed data. Make sure to consult a lawyer before using in production or commercially.

4
added_tokens.json Normal file
View File

@@ -0,0 +1,4 @@
{
"<|im_end|>": 32001,
"<|im_start|>": 32000
}

26
config.json Normal file
View File

@@ -0,0 +1,26 @@
{
"_name_or_path": "dreamgen/WizardLM-2-7B",
"architectures": [
"MistralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mistral",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"rms_norm_eps": 1e-05,
"rope_theta": 10000.0,
"sliding_window": 4096,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.40.0.dev0",
"use_cache": false,
"vocab_size": 32002
}

1
configuration.json Normal file
View File

@@ -0,0 +1 @@
{"framework": "pytorch", "task": "text-generation", "allow_remote": true}

6
generation_config.json Normal file
View File

@@ -0,0 +1,6 @@
{
"_from_model_config": true,
"bos_token_id": 1,
"eos_token_id": 2,
"transformers_version": "4.40.0.dev0"
}

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d07cbee8dfe1b5538c7f96c3d0b33ef60db29f1b53c8166fbb8a7325c0d6041d
size 4943178720

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2520d7619b92b6cb081bfb112e1af744887516689596b633a5f7668150434699
size 4999819336

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:11002caa3cf4641bf597c5095ff6d51c7722f26555c755a8a0f545b7ff5104a5
size 4540532728

View File

@@ -0,0 +1,298 @@
{
"metadata": {
"total_size": 14483496960
},
"weight_map": {
"lm_head.weight": "model-00003-of-00003.safetensors",
"model.embed_tokens.weight": "model-00001-of-00003.safetensors",
"model.layers.0.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.0.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.0.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.0.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.0.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.0.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.0.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.0.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.0.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.1.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.1.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.1.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.10.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.10.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.10.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.10.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.10.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.10.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.10.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.10.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.10.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.11.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.11.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.11.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.11.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.12.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.12.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.12.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.13.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.13.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.13.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.14.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.14.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.14.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.15.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.15.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.15.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.16.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.16.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.16.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.17.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.17.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.17.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.18.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.18.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.18.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.18.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.18.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.18.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.18.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.18.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.18.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.19.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.19.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.19.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.19.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.19.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.19.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.19.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.19.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.19.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.2.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.2.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.2.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.2.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.2.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.2.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.2.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.2.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.2.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.20.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.20.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.20.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.20.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.20.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.20.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.20.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.20.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.20.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.21.input_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.21.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.21.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.21.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.21.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
"model.layers.21.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.21.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.21.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.21.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.22.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.22.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.22.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.22.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.22.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.22.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.22.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.22.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.22.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
"model.layers.23.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.23.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.23.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.23.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.23.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.23.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.23.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.23.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.23.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.24.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.24.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.24.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.24.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.24.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.24.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.24.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.24.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.24.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.25.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.25.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.25.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.25.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.25.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.25.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.25.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.25.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.25.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.26.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.26.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.26.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.26.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.26.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.26.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.26.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.26.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.26.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.27.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.27.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.27.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.27.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.27.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.27.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.27.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.27.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.27.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.28.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.28.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.28.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.28.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.28.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.28.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.28.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.28.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.28.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.29.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.29.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.29.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.29.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.29.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.29.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.29.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.29.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.29.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.3.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.3.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.3.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.3.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.30.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.30.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.30.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.30.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.30.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.30.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.30.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.30.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.30.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.31.input_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.31.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.31.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.31.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.31.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
"model.layers.31.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.31.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.31.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.31.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
"model.layers.4.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.4.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.4.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.4.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.5.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.5.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.5.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.6.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.6.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.6.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.6.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.6.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.6.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.6.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.6.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.6.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.7.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.7.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.7.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.7.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.7.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.7.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.7.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.7.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.7.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.8.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.8.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.8.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.8.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.8.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.8.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.8.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.8.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.8.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.9.input_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.9.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.9.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.9.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.9.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
"model.layers.9.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.9.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.9.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
"model.layers.9.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
"model.norm.weight": "model-00003-of-00003.safetensors"
}
}

30
special_tokens_map.json Normal file
View File

@@ -0,0 +1,30 @@
{
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

91140
tokenizer.json Normal file

File diff suppressed because it is too large Load Diff

BIN
tokenizer.model (Stored with Git LFS) Normal file

Binary file not shown.

62
tokenizer_config.json Normal file
View File

@@ -0,0 +1,62 @@
{
"add_bos_token": true,
"add_eos_token": false,
"add_prefix_space": true,
"added_tokens_decoder": {
"0": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"1": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32000": {
"content": "<|im_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
},
"32001": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": false
}
},
"additional_special_tokens": [],
"bos_token": "<s>",
"chat_template": "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}",
"clean_up_tokenization_spaces": false,
"eos_token": "</s>",
"legacy": true,
"model_max_length": 8192,
"pad_token": "<unk>",
"padding_side": "right",
"sp_model_kwargs": {},
"spaces_between_special_tokens": false,
"tokenizer_class": "LlamaTokenizer",
"unk_token": "<unk>",
"use_default_system_prompt": false,
"use_fast": true
}