Initialize project; model provided by the ModelHub XC community

Model: davzoku/frankencria-llama2-12.5b-v1.3-m.2
Source: Original Platform
Commit: 74d62227c5
Author: ModelHub XC
Date: 2026-05-06 07:59:45 +08:00
23 changed files with 93702 additions and 0 deletions

.gitattributes vendored Normal file (+35)

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text

README.md Normal file (+86)

@@ -0,0 +1,86 @@
---
inference: false
language: en
license: llama2
model_type: llama
datasets:
- mlabonne/CodeLlama-2-20k
pipeline_tag: text-generation
base_model:
- davzoku/cria-llama2-7b-v1.3
library_name: transformers
tags:
- mergekit
- merge
- llama-2
---
# FrankenCRIA v1.3-m.2
## What is FrankenCRIA?
<p align="center">
<img src="https://github.com/davzoku/cria/blob/main/assets/frankencria-icon-512x512.png?raw=true" width="300" height="300" alt="FrankenCRIA Logo"> <br>
<i>This is a frankenmerge of <a href="https://huggingface.co/davzoku/cria-llama2-7b-v1.3">davzoku/cria-llama2-7b-v1.3</a>.</i>
</p>
The configuration is the same as [vilm/vinallama-12.5b-chat-DUS](https://huggingface.co/vilm/vinallama-12.5b-chat-DUS).
Please be aware that this model is highly experimental, and no further training has been conducted following the merge.
Therefore, the model performance may not meet expectations, as described in the [SOLAR paper](https://arxiv.org/abs/2312.15166).
## 📦 FrankenCRIA Model Release
FrankenCRIA v1.3 comes in two variants:
- [davzoku/frankencria-llama2-11b-v1.3-m.1](https://huggingface.co/davzoku/frankencria-llama2-11b-v1.3-m.1): 11B FrankenMerge inspired by [Undi95/Mistral-11B-v0.1](https://huggingface.co/Undi95/Mistral-11B-v0.1)
- [davzoku/frankencria-llama2-12.5b-v1.3-m.2](https://huggingface.co/davzoku/frankencria-llama2-12.5b-v1.3-m.2): 12.5B interleaving FrankenMerge inspired by [vilm/vinallama-12.5b-chat-DUS](https://huggingface.co/vilm/vinallama-12.5b-chat-DUS)
## 🧩 Merge Details
### Merge Method
This model was merged using the passthrough merge method.
### Models Merged
The following models were included in the merge:
* [davzoku/cria-llama2-7b-v1.3](https://huggingface.co/davzoku/cria-llama2-7b-v1.3)
### Configuration
The following YAML configuration was used to produce this model:
```yaml
# https://huggingface.co/vilm/vinallama-12.5b-chat-DUS
slices:
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [0, 16]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [8, 16]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [8, 16]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [16, 24]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [16, 24]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [24, 28]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [24, 28]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [28, 32]
merge_method: passthrough
dtype: bfloat16
```
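Passthrough merging simply stacks the listed layer ranges in order, duplicating the interleaved middle blocks, so the depth of the merged model is the sum of the slice widths. A quick sketch (slice bounds copied from the configuration above) confirms the layer count reported in this commit's config.json:

```python
# Slice ranges from the mergekit configuration above; passthrough
# concatenates them in order, so repeated ranges count twice.
slices = [(0, 16), (8, 16), (8, 16), (16, 24),
          (16, 24), (24, 28), (24, 28), (28, 32)]

total_layers = sum(end - start for start, end in slices)
print(total_layers)  # 60, matching "num_hidden_layers": 60 in config.json
```

This is also where the "12.5B" in the model name comes from: 60 layers versus the base model's 32, with embeddings and head shared once.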

added_tokens.json Normal file (+3)

@@ -0,0 +1,3 @@
{
  "<pad>": 32000
}

config.json Normal file (+29)

@@ -0,0 +1,29 @@
{
  "_name_or_path": "davzoku/cria-llama2-7b-v1.3",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 11008,
  "max_position_embeddings": 4096,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 60,
  "num_key_value_heads": 32,
  "pad_token_id": 0,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "rope_theta": 10000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.37.2",
  "use_cache": true,
  "vocab_size": 32000
}

mergekit_config.yml Normal file (+28)

@@ -0,0 +1,28 @@
# https://huggingface.co/vilm/vinallama-12.5b-chat-DUS
slices:
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [0, 16]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [8, 16]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [8, 16]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [16, 24]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [16, 24]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [24, 28]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [24, 28]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [28, 32]
merge_method: passthrough
dtype: bfloat16


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:aca50ee464f25993d854dabec72ae82d858210cffe320021b6c88faa8047ed79
size 1979781688


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:38f886bca8993f0a51f9e9bf2aec4b0b13cb11cd94bacac1c45d2a8aff1c8342
size 1967215752


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c735f1679e4ef078fafc38608ae47f465e0c1456e6610fc8f1fe822874df83e5
size 1990284288


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8d65e02bb4c93f3b3ce934022c818aba54a5b878ee2fc3362b93f44bcf4d873b
size 1933661184


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4f9af6c8b6adf6e9e24e3398426f50a95f8bcf35ecd9636a1c16e5c3e67b7491
size 1979790008


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b740dcefb296e5f4db654a12ad9ab48afbc8bd88023a41ecc8fc061642c83b33
size 1933652880


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d1043ad3301fed76d0d717e7541e63527a7d05bb7bbaa88d58c68aed6310ecfd
size 1977718344


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:02b9e84e7566dbe8a93a6a103783511784630ab8dd7e1d023c5134e3d967bc62
size 1935724512


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1d50f8378a9b534408eb4b434a36b613c84685b94c64e21f326d0fd54006efe9
size 1977726600


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:06df8f427a9c11645664e2aa5fe13438b46e4c3e06f97b23cb327cfcb8a1bf71
size 1753297528


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:84efd33a923751217b8b135ee07b4530fc239c49ec5564694098e8475c02fbbd
size 1918955112


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4bad44acf169e755d69e7abdfca1ab39c8f431f28c04eadba7d9f691420d2713
size 1933661184


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:265138a9c0e6d77f81e8f233e4678d67b15e81b97d6dee5daa495f47cd48a80f
size 1528893416

File diff suppressed because one or more lines are too long

special_tokens_map.json Normal file (+30)

@@ -0,0 +1,30 @@
{
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}

tokenizer.json Normal file (+93400)

File diff suppressed because it is too large

tokenizer.model Normal file (+3)

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347
size 499723

tokenizer_config.json Normal file (+48)

@@ -0,0 +1,48 @@
{
  "add_bos_token": true,
  "add_eos_token": false,
  "added_tokens_decoder": {
    "0": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<s>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "32000": {
      "content": "<pad>",
      "lstrip": false,
      "normalized": true,
      "rstrip": false,
      "single_word": false,
      "special": false
    }
  },
  "bos_token": "<s>",
  "clean_up_tokenization_spaces": false,
  "eos_token": "</s>",
  "legacy": false,
  "model_max_length": 1000000000000000019884624838656,
  "pad_token": "</s>",
  "sp_model_kwargs": {},
  "tokenizer_class": "LlamaTokenizer",
  "unk_token": "<unk>",
  "use_default_system_prompt": false
}
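As a sanity check, the tokenizer files in this commit are mutually consistent: the `<pad>` token from added_tokens.json is appended directly after the base vocabulary, and padding reuses the eos string. A minimal sketch with the values copied from the JSON files above:

```python
# Values copied from config.json, added_tokens.json, and
# tokenizer_config.json in this commit.
config = {"vocab_size": 32000, "bos_token_id": 1, "eos_token_id": 2}
added_tokens = {"<pad>": 32000}
tokenizer_config = {"bos_token": "<s>", "eos_token": "</s>", "pad_token": "</s>"}

# The added <pad> token sits one past the base vocabulary,
# so its id equals the base vocab_size.
assert added_tokens["<pad>"] == config["vocab_size"]

# Padding reuses the eos token string, a common choice for Llama-2 tokenizers.
assert tokenizer_config["pad_token"] == tokenizer_config["eos_token"]
```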