Initialize the project; model provided by the ModelHub XC community
Model: LeroyDyer/Mixtral_AI_Cyber_Matrix_2_0 Source: Original Platform
35 .gitattributes (vendored) Normal file
@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
133 README.md Normal file
@@ -0,0 +1,133 @@
---
base_model:
- LeroyDyer/Mixtral_AI_Multi_TEST
- LeroyDyer/Mixtral_AI_Cyber_Dolphin_2.0
- LeroyDyer/Mixtral_AI_CyberLAW
- LeroyDyer/Mixtral_AI_CyberBrain_3_0
- LeroyDyer/Mixtral_AI_Cyber_5.0
- LeroyDyer/Mixtral_AI_CyberBrain_2.0
- ezelikman/quietstar-8-ahead
- KoboldAI/Mistral-7B-Erebus-v3
library_name: transformers
tags:
- mergekit
- megamerge
- code
- Cyber-Series
license: mit
language:
- en
datasets:
- Open-Orca/OpenOrca
- cognitivecomputations/dolphin
- WhiteRabbitNeo/WRN-Chapter-2
- WhiteRabbitNeo/WRN-Chapter-1
- gate369/Alpaca-Star
- gate369/alpaca-star-ascii
---
<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/65d883893a52cd9bcd8ab7cf/tRsCJlHNZo1D02kBTmfy9.jpeg" width="300"/>

https://github.com/spydaz
Currently undergoing fine-tuning, as this model contains all previous models!

This model contains many hidden tensors: it was merged with many LoRA adapters for various tasks, such as vision and sound.

The problem was that, for some reason, I could not get the extra heads to show up as they do in other models, such as the LLaVA model. You can change the config.json so this model is treated as a LLaVA model, and yes, it works: it can think and has hidden "think" heads, but you need to set the config up yourself. It also has vision heads, but I could not get that configuration working either.

So, hidden talents: the model was also merged with the parent models of these capabilities, for Quiet-STaR (thoughts) and LLaVA (vision, etc.), so the tensors are there; I just did not understand how to fine-tune the additional functionality. Each needs a single training example to populate its hidden tensors, hence the merges. When the model is put into train mode after loading (i.e., calling model.train()), the tensors appear, waiting for training, so just add a PEFT adapter and start training!
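A minimal sketch of that workflow, assuming a standard transformers + peft setup (the LoRA rank, alpha, and target modules below are illustrative guesses, not values taken from this repo):

```python
# Sketch: load the model, switch it to train mode so the merged-in hidden
# tensors are exposed, then attach a LoRA adapter with PEFT for fine-tuning.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "LeroyDyer/Mixtral_AI_Cyber_Matrix_2_0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

model.train()  # per the note above: train mode surfaces the hidden tensors

lora_config = LoraConfig(
    r=16,                                 # assumed rank
    lora_alpha=32,                        # assumed scaling
    target_modules=["q_proj", "v_proj"],  # assumed targets; adjust per task
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # confirm only adapter weights will train
```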
THIS VERSION HAS BEEN UPDATED TO INCLUDE CYBERBRAIN! (Hidden Tensors)

## Extended capabilities:
* mistralai/Mistral-7B-Instruct-v0.1 - prime base
* ChaoticNeutrals/Eris-LelantaclesV2-7b - roleplay
* ChaoticNeutrals/Eris_PrimeV3-Vision-7B - vision
* rvv-karma/BASH-Coder-Mistral-7B - coding
* Locutusque/Hercules-3.1-Mistral-7B - unhinging
* KoboldAI/Mistral-7B-Erebus-v3 - NSFW
* Locutusque/Hyperion-2.1-Mistral-7B - chat
* Severian/Nexus-IKM-Mistral-7B-Pytorch - thinking
* NousResearch/Hermes-2-Pro-Mistral-7B - generalizing
* mistralai/Mistral-7B-Instruct-v0.2 - base
* Nitral-AI/ProdigyXBioMistral_7B - medical
* Nitral-AI/Infinite-Mika-7b - 128k context expansion (enforcement)
* Nous-Yarn-Mistral-7b-128k - 128k context expansion
* yanismiraoui/Yarn-Mistral-7b-128k-sharded
* ChaoticNeutrals/Eris_Prime-V2-7B - roleplay
This expert is a companion to the MEGA_MIND 24b CyberSeries, which represents a groundbreaking leap in the realm of language models, integrating a diverse array of expert models into a unified framework. At its core lies the Mistral-7B-Instruct-v0.2, a refined instructional model designed for versatility and efficiency.

Enhanced with an expanded context window and advanced routing mechanisms, the Mistral-7B-Instruct-v0.2 exemplifies the power of Mixture of Experts, allowing seamless integration of specialized sub-models. This architecture facilitates strong performance and scalability, enabling the CyberSeries to tackle a myriad of tasks with speed and accuracy.

Among its illustrious sub-models, the OpenOrca - Mistral-7B-8k shines as a testament to fine-tuning excellence, boasting top-ranking performance in its class. Meanwhile, Hermes 2 Pro introduces capabilities such as Function Calling and JSON Mode, catering to diverse application needs.

Driven by Reinforcement Learning from AI Feedback, the Starling-LM-7B-beta demonstrates remarkable adaptability and optimization, while the Phi-1.5 Transformer model stands as a beacon of excellence across various domains, from common-sense reasoning to medical inference.

With models like BioMistral tailored specifically for medical applications and Nous-Yarn-Mistral-7b-128k excelling at long-context data, the MEGA_MIND 24b CyberSeries emerges as a transformative force in the landscape of language understanding and artificial intelligence.

Experience the future of language models with the MEGA_MIND 24b CyberSeries, where innovation meets performance and the possibilities are limitless.
### Models Merged

The following models were included in the merge:

* [LeroyDyer/Mixtral_AI_Multi_TEST](https://huggingface.co/LeroyDyer/Mixtral_AI_Multi_TEST)
* [LeroyDyer/Mixtral_AI_CyberLAW](https://huggingface.co/LeroyDyer/Mixtral_AI_CyberLAW)
* [LeroyDyer/Mixtral_AI_CyberBrain_3_0](https://huggingface.co/LeroyDyer/Mixtral_AI_CyberBrain_3_0)
* [LeroyDyer/Mixtral_AI_Cyber_5.0](https://huggingface.co/LeroyDyer/Mixtral_AI_Cyber_5.0)
### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: LeroyDyer/Mixtral_AI_Cyber_Dolphin_2.0
    parameters:
      density: [0.256, 0.512, 0.128] # density gradient
      weight: 0.382
  - model: LeroyDyer/Mixtral_AI_CyberLAW
    parameters:
      density: 0.382
      weight: [0.256, 0.128, 0.256, 0.128] # weight gradient
  - model: LeroyDyer/Mixtral_AI_CyberBrain_3_0
    parameters:
      density: 0.382
      weight: [0.128, 0.512, 0.128, 0.128] # weight gradient
  - model: LeroyDyer/Mixtral_AI_Multi_TEST
    parameters:
      density: 0.382
      weight: [0.128, 0.512, 0.128, 0.128] # weight gradient
  - model: LeroyDyer/Mixtral_AI_Cyber_5.0
    parameters:
      density: 0.382
      weight:
        - filter: mlp
          value: 0.5
        - value: 0
merge_method: ties
base_model: LeroyDyer/Mixtral_AI_Cyber_Dolphin_2.0
parameters:
  normalize: true
  int8_mask: true
dtype: float16
```
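To reproduce a merge from a configuration like this one, mergekit provides a `mergekit-yaml` command line entry point; a minimal sketch with placeholder paths (assumes `pip install mergekit`):

```python
# Sketch: save the YAML above to merge_config.yml, then call mergekit's CLI.
import subprocess

subprocess.run(
    ["mergekit-yaml", "merge_config.yml", "./Mixtral_AI_Cyber_Matrix_2_0"],
    check=True,
)
```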
28 config.json Normal file
@@ -0,0 +1,28 @@
{
  "_name_or_path": "LeroyDyer/Mixtral_AI_Cyber_Matrix_2_0",
  "architectures": [
    "MistralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 32768,
  "model_type": "mistral",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 8,
  "pad_token_id": 2,
  "rms_norm_eps": 1e-05,
  "rope_theta": 10000.0,
  "sliding_window": 4096,
  "tie_word_embeddings": false,
  "torch_dtype": "float16",
  "transformers_version": "4.38.2",
  "unsloth_version": "2024.3",
  "use_cache": true,
  "vocab_size": 32000
}
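These values describe a standard Mistral architecture: grouped-query attention (32 query heads sharing 8 key/value heads), a 32k position window, and a 4k sliding attention window. A minimal sketch to sanity-check them after download, using the standard transformers API:

```python
# Sketch: load the config and confirm it matches the file above.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("LeroyDyer/Mixtral_AI_Cyber_Matrix_2_0")
print(config.model_type)               # "mistral"
print(config.num_attention_heads)      # 32
print(config.num_key_value_heads)      # 8  (grouped-query attention)
print(config.max_position_embeddings)  # 32768
print(config.sliding_window)           # 4096
```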
7 generation_config.json Normal file
@@ -0,0 +1,7 @@
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "pad_token_id": 2,
  "transformers_version": "4.38.2"
}
10 mergekit_config.yml Normal file
@@ -0,0 +1,10 @@
models:
  - model: LeroyDyer/Mixtral_AI_Cyber_Matrix
    parameters:
      weight: 0.512
  - model: LeroyDyer/Mixtral_AI_CyberBrain_3_0
    parameters:
      weight: 0.512
merge_method: linear
dtype: float16
3 model-00001-of-00003.safetensors Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bc1058a0274a778e649614fe1b3efbfc853774d7705b0a4b02eeb51f55649592
size 5899498872
3 model-00002-of-00003.safetensors Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:48659aa5a88ad5edb6156937c9f8636ff0959abbda9572812c9a7b8b6f8b342c
size 5905823392
3 model-00003-of-00003.safetensors Normal file
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3970885bd74bc896dcd2f3fabd57382c92632256d44d41e9431dc549cdfde273
size 2678175464
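Each Git LFS pointer records the sha256 of the actual shard, so a downloaded file can be verified locally; a minimal sketch using the third shard's oid (the local path is assumed):

```python
# Sketch: hash a downloaded shard and compare it to the pointer's oid.
import hashlib

expected = "3970885bd74bc896dcd2f3fabd57382c92632256d44d41e9431dc549cdfde273"

h = hashlib.sha256()
with open("model-00003-of-00003.safetensors", "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # 1 MiB chunks
        h.update(chunk)

assert h.hexdigest() == expected, "shard is corrupt or incomplete"
```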
1 model.safetensors.index.json Normal file
File diff suppressed because one or more lines are too long
30 special_tokens_map.json Normal file
@@ -0,0 +1,30 @@
{
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
91122 tokenizer.json Normal file
File diff suppressed because it is too large
BIN tokenizer.model (Stored with Git LFS) Normal file
Binary file not shown.
44 tokenizer_config.json Normal file
@@ -0,0 +1,44 @@
{
  "add_bos_token": true,
  "add_eos_token": false,
  "added_tokens_decoder": {
    "0": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "additional_special_tokens": [],
  "bos_token": "<s>",
  "chat_template": "{{ bos_token }}{% for message in messages %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% if message['role'] == 'user' %}{{ '[INST] ' + message['content'] + ' [/INST]' }}{% elif message['role'] == 'assistant' %}{{ message['content'] + eos_token}}{% else %}{{ raise_exception('Only user and assistant roles are supported!') }}{% endif %}{% endfor %}",
  "clean_up_tokenization_spaces": false,
  "eos_token": "</s>",
  "legacy": true,
  "model_max_length": 32768,
  "pad_token": "<unk>",
  "padding_side": "right",
  "sp_model_kwargs": {},
  "spaces_between_special_tokens": false,
  "tokenizer_class": "LlamaTokenizer",
  "unk_token": "<unk>",
  "use_default_system_prompt": false
}
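The `chat_template` above renders Mistral-style `[INST] ... [/INST]` turns with strict user/assistant alternation; a minimal sketch of applying it with the standard transformers API:

```python
# Sketch: format a conversation with the tokenizer's chat template.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("LeroyDyer/Mixtral_AI_Cyber_Matrix_2_0")

messages = [{"role": "user", "content": "Hello, who are you?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)  # <s>[INST] Hello, who are you? [/INST]
```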