Initialize project; model provided by the ModelHub XC community

Model: louisbrulenaudet/Pearl-3x7B
Source: Original Platform
Author: ModelHub XC
Date: 2026-04-11 08:53:03 +08:00
Commit: 62d150aeb6
30 changed files with 460 additions and 0 deletions

.gitattributes (vendored, new file, 69 lines)

@@ -0,0 +1,69 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bin.* filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zstandard filter=lfs diff=lfs merge=lfs -text
*.tfevents* filter=lfs diff=lfs merge=lfs -text
*.db* filter=lfs diff=lfs merge=lfs -text
*.ark* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*data* filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.meta filter=lfs diff=lfs merge=lfs -text
**/*ckpt*.index filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.gguf* filter=lfs diff=lfs merge=lfs -text
*.ggml filter=lfs diff=lfs merge=lfs -text
*.llamafile* filter=lfs diff=lfs merge=lfs -text
*.pt2 filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
model-00012-of-00019.safetensors filter=lfs diff=lfs merge=lfs -text
model-00014-of-00019.safetensors filter=lfs diff=lfs merge=lfs -text
model-00004-of-00019.safetensors filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
model-00016-of-00019.safetensors filter=lfs diff=lfs merge=lfs -text
model-00015-of-00019.safetensors filter=lfs diff=lfs merge=lfs -text
model-00006-of-00019.safetensors filter=lfs diff=lfs merge=lfs -text
model-00009-of-00019.safetensors filter=lfs diff=lfs merge=lfs -text
model-00003-of-00019.safetensors filter=lfs diff=lfs merge=lfs -text
model-00005-of-00019.safetensors filter=lfs diff=lfs merge=lfs -text
model-00010-of-00019.safetensors filter=lfs diff=lfs merge=lfs -text
model-00008-of-00019.safetensors filter=lfs diff=lfs merge=lfs -text
model-00018-of-00019.safetensors filter=lfs diff=lfs merge=lfs -text
model-00013-of-00019.safetensors filter=lfs diff=lfs merge=lfs -text
model-00001-of-00019.safetensors filter=lfs diff=lfs merge=lfs -text
model-00017-of-00019.safetensors filter=lfs diff=lfs merge=lfs -text
model-00002-of-00019.safetensors filter=lfs diff=lfs merge=lfs -text
model-00007-of-00019.safetensors filter=lfs diff=lfs merge=lfs -text
model-00019-of-00019.safetensors filter=lfs diff=lfs merge=lfs -text
model-00011-of-00019.safetensors filter=lfs diff=lfs merge=lfs -text
tokenizer.model filter=lfs diff=lfs merge=lfs -text

README.md (new file, 146 lines)

@@ -0,0 +1,146 @@
---
license: apache-2.0
tags:
- moe
- frankenmoe
- merge
- mergekit
- lazymergekit
- dvilasuero/DistilabelBeagle14-7B
- beowolx/CodeNinja-1.0-OpenChat-7B
- WizardLM/WizardMath-7B-V1.1
- Maths
- Code
- Python
base_model:
- dvilasuero/DistilabelBeagle14-7B
- beowolx/CodeNinja-1.0-OpenChat-7B
- WizardLM/WizardMath-7B-V1.1
language:
- en
library_name: transformers
pipeline_tag: text-generation
---
<center><img src='https://i.imgur.com/0xFTuAX.png' width='450px'></center>
# Pearl-3x7B, an xtraordinary Mixture of Experts (MoE) for data science
Pearl-3x7B is a Mixture of Experts (MoE) built from the following models:
* [dvilasuero/DistilabelBeagle14-7B](https://huggingface.co/dvilasuero/DistilabelBeagle14-7B)
* [beowolx/CodeNinja-1.0-OpenChat-7B](https://huggingface.co/beowolx/CodeNinja-1.0-OpenChat-7B)
* [WizardLM/WizardMath-7B-V1.1](https://huggingface.co/WizardLM/WizardMath-7B-V1.1)
A Mixture of Experts (MoE) model combines several specialized models to handle a wide range of tasks within a unified framework. For a chat-oriented model, integrating expertise from three domains (chat, code, and mathematics) substantially improves its ability to produce nuanced, precise responses across a diverse spectrum of user queries.

The first expert, tuned for chat, excels at natural-language understanding, conversational dynamics, and contextual cues. Trained on extensive conversational data, it generates engaging, contextually relevant responses that support meaningful interaction with users.

The second expert, focused on code, contributes proficiency in programming languages, algorithms, and software-engineering principles. With a solid grasp of syntax, logical constructs, and problem-solving methodology, it handles coding challenges, debugging assistance, and software-development questions.

The third expert, specialized in mathematics, covers mathematical reasoning, problem-solving strategies, and analytical techniques. With knowledge spanning arithmetic, algebra, calculus, and beyond, it offers precise solutions and clear explanations for mathematical queries, equations, and proofs.
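In a Mixtral-style architecture like this one, the experts are not chosen per request but per token: a learned router scores all experts and activates the top k of them (here, 2 of the 3 experts, per `num_experts_per_tok` in this repository's `config.json`). A minimal sketch of that routing step, not the actual implementation:

```python
import math

def route_token(gate_logits, k=2):
    """Top-k expert routing as in Mixtral-style MoE layers: pick the k
    highest-scoring experts, then renormalize their weights with a
    softmax over just those k logits."""
    topk = sorted(range(len(gate_logits)),
                  key=lambda i: gate_logits[i], reverse=True)[:k]
    exps = [math.exp(gate_logits[i]) for i in topk]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(topk, exps)]

# Hypothetical router logits for one token over the 3 experts
# (chat, code, math); this token looks code-like, so the code
# expert receives most of the gate weight.
weights = route_token([0.1, 2.0, 0.5], k=2)
```

The selected experts' outputs are then summed, weighted by these gate values, so only 2 of 3 expert FFNs run per token.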
## Configuration
```yaml
base_model: argilla/CapybaraHermes-2.5-Mistral-7B
experts:
- source_model: dvilasuero/DistilabelBeagle14-7B
positive_prompts:
- "chat"
- "assistant"
- "tell me"
- "explain"
- "help"
- "guide"
- "assist"
- "answer"
- "support"
- "clarify"
- "elaborate"
- "educate"
- "inform"
- "advise"
- "instruct"
- source_model: beowolx/CodeNinja-1.0-OpenChat-7B
positive_prompts:
- "code"
- "python"
- "javascript"
- "programming"
- "algorithm"
- "develop"
- "debug"
- "optimize"
- "software"
- "engineer"
- "web"
- "application"
- "framework"
- "library"
- "syntax"
- "logic"
- "compile"
- "execute"
- source_model: WizardLM/WizardMath-7B-V1.1
positive_prompts:
- "reason"
- "math"
- "mathematics"
- "solve"
- "count"
- "calculate"
- "analyze"
- "derive"
- "compute"
- "numbers"
- "equation"
- "theorem"
- "proof"
- "geometry"
- "trigonometry"
- "statistics"
- "probability"
- "algebra"
- "integral"
```
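This YAML is a mergekit MoE configuration, where each expert's `positive_prompts` seed the router's gate weights. Assuming mergekit is installed, a merge along these lines could be run roughly as follows (a sketch; flags and package layout may differ across mergekit versions):

```shell
# Assumption: mergekit installed from PyPI with MoE support
pip install mergekit

# Build the frankenMoE from the config above; this downloads the
# base model and all three source models, so expect tens of GB
mergekit-moe mergekit_moe_config.yml ./Pearl-3x7B
```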
## Usage
```python
# In a notebook, first install the dependencies:
# !pip install -qU transformers bitsandbytes accelerate
import torch
import transformers
from transformers import AutoTokenizer

model = "louisbrulenaudet/Pearl-3x7B"
tokenizer = AutoTokenizer.from_pretrained(model)

# load_in_4bit quantizes the weights with bitsandbytes so the
# ~18B-parameter MoE fits on a single consumer GPU
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
## Citing & Authors
If you use this code in your research, please use the following BibTeX entry.
```BibTeX
@misc{louisbrulenaudet2023,
  author = {Louis Brulé Naudet},
  title = {Pearl-3x7B, an xtraordinary Mixture of Experts (MoE) for data science},
  year = {2023},
  howpublished = {\url{https://huggingface.co/louisbrulenaudet/Pearl-3x7B}}
}
```
## Feedback
If you have any feedback, please reach out at [louisbrulenaudet@icloud.com](mailto:louisbrulenaudet@icloud.com).

added_tokens.json (new file, 4 lines)

@@ -0,0 +1,4 @@
{
"<|im_end|>": 32000,
"<|im_start|>": 32001
}
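These entries extend the base Mistral vocabulary (ids 0–31999) with the ChatML delimiters, which is why `config.json` reports `vocab_size: 32002`. A quick sanity check using only this JSON:

```python
import json

# Contents of added_tokens.json from this repository
added = json.loads('{"<|im_end|>": 32000, "<|im_start|>": 32001}')

# The base Mistral vocabulary occupies ids 0..31999, so the ChatML
# markers land immediately after it.
new_vocab_size = 32000 + len(added)  # 32002, matching config.json
```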

config.json (new file, 30 lines)

@@ -0,0 +1,30 @@
{
"_name_or_path": "argilla/CapybaraHermes-2.5-Mistral-7B",
"architectures": [
"MixtralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 32000,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mixtral",
"num_attention_heads": 32,
"num_experts_per_tok": 2,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"num_local_experts": 3,
"output_router_logits": false,
"rms_norm_eps": 1e-05,
"rope_theta": 10000.0,
"router_aux_loss_coef": 0.001,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "float16",
"transformers_version": "4.37.2",
"use_cache": false,
"vocab_size": 32002
}
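The dimensions above pin down the parameter count. A back-of-the-envelope calculation, assuming the standard Mixtral layout (SwiGLU experts with gate/up/down projections, grouped-query attention; layer norms omitted as negligible):

```python
# Values from config.json
hidden, inter, layers = 4096, 14336, 32
heads, kv_heads = 32, 8
experts, vocab = 3, 32002
head_dim = hidden // heads  # 128

# Grouped-query attention: q and o projections are hidden x hidden,
# k and v are hidden x (kv_heads * head_dim)
attn = 2 * hidden * hidden + 2 * hidden * kv_heads * head_dim

# Each SwiGLU expert has gate, up, and down projections
ffn = experts * 3 * hidden * inter
router = hidden * experts

per_layer = attn + ffn + router
embeddings = 2 * vocab * hidden  # tie_word_embeddings is false

total = layers * per_layer + embeddings
print(f"{total / 1e9:.1f}B parameters")  # → 18.5B parameters
```

At float16 that is roughly 37 GB, consistent with the safetensors shard sizes in this commit, and explains the "3x7B" naming: three 7B experts sharing attention weights.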

configuration.json (new file, 1 line)

@@ -0,0 +1 @@
{"framework": "pytorch", "task": "text-generation", "allow_remote": true}

mergekit_moe_config.yml (new file, 61 lines)

@@ -0,0 +1,61 @@
base_model: argilla/CapybaraHermes-2.5-Mistral-7B
experts:
- source_model: dvilasuero/DistilabelBeagle14-7B
positive_prompts:
- "chat"
- "assistant"
- "tell me"
- "explain"
- "help"
- "guide"
- "assist"
- "answer"
- "support"
- "clarify"
- "elaborate"
- "educate"
- "inform"
- "advise"
- "instruct"
- source_model: beowolx/CodeNinja-1.0-OpenChat-7B
positive_prompts:
- "code"
- "python"
- "javascript"
- "programming"
- "algorithm"
- "develop"
- "debug"
- "optimize"
- "software"
- "engineer"
- "web"
- "application"
- "framework"
- "library"
- "syntax"
- "logic"
- "compile"
- "execute"
- source_model: WizardLM/WizardMath-7B-V1.1
positive_prompts:
- "reason"
- "math"
- "mathematics"
- "solve"
- "count"
- "calculate"
- "analyze"
- "derive"
- "compute"
- "numbers"
- "equation"
- "theorem"
- "proof"
- "geometry"
- "trigonometry"
- "statistics"
- "probability"
- "algebra"
- "integral"


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9fd98130c520239927ba4fb3eec2ac5db398482fa64085bc73e0b53e5710c620
size 1933882656


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f582751fd1e3f380a5261210a9bd303604065d6d38f10d468805ddc755986521
size 1996490936


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3e072b7a97d69997a61aeaec63f7b8bc1470cd68a3bd1d1c786889b45413b75a
size 1996490952


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5225a7fe639cccbd1aff1d85be606583d5c7fc363e8d062a1c79a5ea25cac7c9
size 1996490952


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2f308145891966075878f1b5f37cd8c3e5a193f4a82b09ef384e2310fa7b46a7
size 1996490952


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:609237bb2ba0bdc1a0c2528b681bbc1d463f3de3d04c687177f154f928b46a2a
size 1996490952


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:96d41d9c04c32231e3f431f9aeabb79a1bae6c64380c94ac4021cd77dbc17fbd
size 1996490936


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1442748cdd7cc9bc39bd068f5119f9481800008ed34402a10b083a7ab6796e54
size 1996490936


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8a2e9de2c2509a207edd82115af511f122c7abfdf0758501867b1118e6391d41
size 1996490952


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7e2ab43dac0f23b30731f0b36ed7feceabe0a692baef03c5c57fb9dc431043de
size 1996490952


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ad78496cc40362f784b50e7962d43d251ab2585ac7d29a05512863671d72e9f0
size 1996490952


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:baed25bf28aa66c718aaa6b541b62a49abb09099bc0569a634c05a14a188453d
size 1996490944


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b86739adf7f1c0d8e2e3d686d4e76b1d7aca41087b4656b958643b8b8beba8f4
size 1996490936


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b346100083a795825011b0330ff289055e23cf4fcd75d3ad2011b9ac5d4922d1
size 1996490944


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9281a5a7cd4d0578a482bfb54e86a49881f582994fe57b2329b96e46305abb3f
size 1996490952


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bb67656a258659da37da4c06c887208396967780d4cacd2bd41302a3a49ff91e
size 1996490952


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2dafa2241d8db920255bb5be263a4bf6752ff6e913d31cbb17b0246e48e8c20a
size 1996490952


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:eeb423411a57d2058bfe0270425cfbde22939f7c06dc1c2d8fdfc29dd85a29a1
size 1996765088


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:62ccf1aff3a6c89117a4fbc0096db353e3389a01bf38bc0bdb9693f41ccbc70e
size 1158422928

File diff suppressed because one or more lines are too long

special_tokens_map.json (new file, 24 lines)

@@ -0,0 +1,24 @@
{
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": "<s>",
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

tokenizer.json (new file, 3 lines)

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c7cf57dadebd4785143c06f48ffeec2ec498062ef26c5c527152805cbaac14d8
size 1795829

tokenizer.model (binary, stored with Git LFS, new file)

Binary file not shown.

tokenizer_config.json (new file, 61 lines)

@@ -0,0 +1,61 @@
{
"add_bos_token": true,
"add_eos_token": false,
"added_tokens_decoder": {
"0": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"1": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32000": {
"content": "<|im_end|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"32001": {
"content": "<|im_start|>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
},
"additional_special_tokens": [],
"bos_token": "<s>",
"chat_template": "{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}",
"clean_up_tokenization_spaces": false,
"eos_token": "<|im_end|>",
"legacy": true,
"model_max_length": 1000000000000000019884624838656,
"pad_token": "<s>",
"sp_model_kwargs": {},
"spaces_between_special_tokens": false,
"tokenizer_class": "LlamaTokenizer",
"trust_remote_code": false,
"unk_token": "<unk>",
"use_default_system_prompt": true,
"use_fast": true
}
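The `chat_template` above is the ChatML format. Rendered by hand, reproducing the Jinja logic in plain Python as a sketch:

```python
def apply_chatml(messages, add_generation_prompt=False):
    """Plain-Python rendering of the ChatML chat_template from
    tokenizer_config.json above."""
    prompt = ""
    for m in messages:
        # '<|im_start|>' + role + '\n' + content + '<|im_end|>' + '\n'
        prompt += "<|im_start|>" + m["role"] + "\n" + m["content"] + "<|im_end|>" + "\n"
    if add_generation_prompt:
        # Open the assistant turn so the model completes it
        prompt += "<|im_start|>assistant\n"
    return prompt

rendered = apply_chatml(
    [{"role": "user", "content": "Hello"}],
    add_generation_prompt=True,
)
# rendered == "<|im_start|>user\nHello<|im_end|>\n<|im_start|>assistant\n"
```

This matches the special-token configuration: `<|im_end|>` (id 32000) doubles as the EOS token, so generation stops at the end of the assistant turn.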