Initialize project; model provided by the ModelHub XC community

Model: MaziyarPanahi/Calme-4x7B-MoE-v0.2
Source: Original Platform
ModelHub XC
2026-05-29 20:44:00 +08:00
commit a22c11aab4
13 changed files with 91471 additions and 0 deletions

35
.gitattributes vendored Normal file

@@ -0,0 +1,35 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
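To make the effect of these rules concrete, the following is a rough Python sketch that checks whether a file name would be routed through Git LFS. It is only an approximation: it uses `fnmatch` rather than Git's own matcher, so patterns containing `/` or `**` (such as `saved_model/**/*`) may behave differently. `LFS_PATTERNS`, `is_lfs_tracked`, and the shard file name are hypothetical names chosen for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A subset of the patterns listed above (hypothetical constant name).
LFS_PATTERNS = ["*.bin", "*.model", "*.safetensors", "*.tar.*", "*tfevents*"]


def is_lfs_tracked(path, patterns=LFS_PATTERNS):
    """Approximate check: slash-free patterns are compared with the base name."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in patterns)


# The shard name below is a made-up example, not taken from this commit.
for candidate in ["model-00001-of-00005.safetensors", "tokenizer.model",
                  "tokenizer.json", "config.json"]:
    print(candidate, is_lfs_tracked(candidate))
```

Under this approximation `tokenizer.model` matches `*.model`, which agrees with the commit view marking it as stored with Git LFS, while `tokenizer.json` and `config.json` are kept as regular text files.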

189
README.md Normal file

@@ -0,0 +1,189 @@
---
license: apache-2.0
tags:
- generated_from_trainer
- mistral
- 7b
- calme
- moe
- mixtral
model-index:
- name: Calme-4x7B-MoE-v0.2
results: []
model_name: Calme-4x7B-MoE-v0.2
inference: false
model_creator: MaziyarPanahi
pipeline_tag: text-generation
quantized_by: MaziyarPanahi
---
<img src="https://cdn-uploads.huggingface.co/production/uploads/5fd5e18a90b6dc4633f6d292/LzEf6vvq2qIiys-q7l9Hq.webp" width="550" />
# MaziyarPanahi/Calme-4x7B-MoE-v0.2
## Model Description
Calme-4x7B is a Mixture of Experts (MoE) model built from four individually fine-tuned Calme-7B models, with two experts active per token. This configuration brings the total to over 24 billion parameters. Calme-4x7B models are designed to generate text that is clear, calm, and coherent.
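As a toy illustration of what "two experts per token" means, the sketch below implements top-2 routing in the style commonly used by Mixtral-type models: a softmax over the router scores, selection of the two highest-scoring experts, and renormalisation of their weights. It is not the model's actual code; `route_top_k`, `moe_forward`, the scalar "experts", and the example logits are all assumptions made for illustration.

```python
import math


def route_top_k(router_logits, k=2):
    """Return [(expert_index, weight), ...] for the k highest-scoring experts."""
    peak = max(router_logits)
    exps = [math.exp(logit - peak) for logit in router_logits]  # stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    chosen = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
    kept = sum(probs[i] for i in chosen)
    # Renormalise so the weights of the chosen experts sum to 1.
    return [(i, probs[i] / kept) for i in chosen]


def moe_forward(x, experts, router_logits, k=2):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(weight * experts[i](x)
               for i, weight in route_top_k(router_logits, k))


# Four toy "experts" standing in for the four fine-tuned feed-forward blocks.
experts = [lambda x, scale=scale: scale * x for scale in (1.0, 2.0, 3.0, 4.0)]
logits = [0.1, 2.0, -1.0, 1.0]  # made-up router scores for one token
routes = route_top_k(logits)
print(routes)
print(moe_forward(1.0, experts, logits))
```

Only the two selected experts are evaluated for a given token, which is why the compute per token is lower than the total parameter count suggests.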
### How to Use
```python
# Option 1: use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="MaziyarPanahi/Calme-4x7B-MoE-v0.2")

# Option 2: load the tokenizer and model directly
# (an alternative to the pipeline above, not an additional step)
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("MaziyarPanahi/Calme-4x7B-MoE-v0.2")
model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/Calme-4x7B-MoE-v0.2")
```
### Eval
| Metric | [Mistral-7B Instruct v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) | [Calme-7B v0.1](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.1) | [Calme-7B v0.2](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.2) | [Calme-7B v0.3](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.3) | [Calme-7B v0.4](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.4) | [Calme-7B v0.5](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.5) | [Calme-4x7B v0.1](https://huggingface.co/MaziyarPanahi/Calme-4x7B-MoE-v0.1) | [Calme-4x7B v0.2](https://huggingface.co/MaziyarPanahi/Calme-4x7B-MoE-v0.2) |
|-----------|--------------------------|-------|-------|-------|-------|-------|------------|------------|
| ARC | 63.14 | 67.24 | 67.75 | 67.49 | 64.85 | 67.58 | 67.15 | 76.66 |
| HellaSwag | 84.88 | 85.57 | 87.52 | 87.57 | 86.00 | 87.26 | 86.89 | 86.84 |
| TruthfulQA| 68.26 | 59.38 | 78.41 | 78.31 | 70.52 | 74.03 | 73.30 | 73.06 |
| MMLU | 60.78 | 64.97 | 61.83 | 61.93 | 62.01 | 62.04 | 62.16 | 62.16 |
| Winogrande| 77.19 | 83.35 | 82.08 | 82.32 | 79.48 | 81.85 | 80.82 | 81.06 |
| GSM8k | 40.03 | 69.29 | 73.09 | 73.09 | 77.79 | 73.54 | 74.53 | 75.66 |
Some extra information to help you pick the right `Calme-7B` model:
| Use Case Category | Recommended Calme-7B Model | Reason |
|-------------------------------------------------|-----------------------------|------------------------------------------------------------------------------------------|
| Educational Tools and Academic Research | [Calme-7B v0.5](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.5) | Balanced performance, especially strong in TruthfulQA for accuracy and broad knowledge. |
| Commonsense Reasoning and Natural Language Apps | [Calme-7B v0.2](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.2) or [Calme-7B v0.3](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.3) | High performance in HellaSwag for understanding nuanced scenarios. |
| Trustworthy Information Retrieval Systems | [Calme-7B v0.5](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.5) | Highest score in TruthfulQA, indicating reliable factual information provision. |
| Math Educational Software | [Calme-7B v0.4](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.4) | Best performance in GSM8k, suitable for numerical reasoning and math problem-solving. |
| Context Understanding and Disambiguation | [Calme-7B v0.5](https://huggingface.co/MaziyarPanahi/Calme-7B-Instruct-v0.5) | Solid performance in Winogrande, ideal for text with context and pronoun disambiguation. |
### Quantized Models
> I love how GGUF democratizes the use of Large Language Models (LLMs) on commodity hardware, more specifically, personal computers without any accelerated hardware. Because of this, I am committed to converting and quantizing any models I fine-tune to make them accessible to everyone!
- GGUF (2/3/4/5/6/8 bits): [MaziyarPanahi/Calme-4x7B-MoE-v0.2-GGUF](https://huggingface.co/MaziyarPanahi/Calme-4x7B-MoE-v0.2-GGUF)
## Examples
```
<s>[INST] You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
describe about pros and cons of docker system. [/INST]
```
<details>
<summary>Show me the response</summary>
```
```
</details>
```
<s> [INST] Mark is faster than Mary, Mary is faster than Joe. Is Joe faster than Mark? Let's think step by step [/INST]
```
<details>
<summary>Show me the response</summary>
```
```
</details>
```
<s> [INST] explain step by step 25-4*2+3=? [/INST]
```
<details>
<summary>Show me the response</summary>
```
```
</details>
**Multilingual:**
```
<s> [INST] Vous êtes un assistant utile, respectueux et honnête. Répondez toujours de la manière la plus utile possible, tout en étant sûr. Vos réponses ne doivent inclure aucun contenu nuisible, contraire à l'éthique, raciste, sexiste, toxique, dangereux ou illégal. Assurez-vous que vos réponses sont socialement impartiales et de nature positive.
Si une question n'a pas de sens ou n'est pas cohérente d'un point de vue factuel, expliquez pourquoi au lieu de répondre quelque chose d'incorrect. Si vous ne connaissez pas la réponse à une question, veuillez ne pas partager de fausses informations.
Décrivez les avantages et les inconvénients du système Docker.[/INST]
```
<details>
<summary>Show me the response</summary>
```
```
</details>
```
<s>[INST] Ви - корисний, поважний та чесний помічник. Завжди відповідайте максимально корисно, будучи безпечним. Ваші відповіді не повинні містити шкідливого, неетичного, расистського, сексистського, токсичного, небезпечного або нелегального контенту. Будь ласка, переконайтеся, що ваші відповіді соціально неупереджені та мають позитивний характер.
Якщо питання не має сенсу або не є фактично послідовним, поясніть чому, замість того, щоб відповідати щось некоректне. Якщо ви не знаєте відповіді на питання, будь ласка, не діліться неправдивою інформацією.
Опис про переваги та недоліки системи Docker.[/INST]
```
<details>
<summary>Show me the response</summary>
```
```
</details>

29
config.json Normal file

@@ -0,0 +1,29 @@
{
"architectures": [
"MixtralForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 1,
"eos_token_id": 2,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 32768,
"model_type": "mixtral",
"num_attention_heads": 32,
"num_experts_per_tok": 2,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"num_local_experts": 4,
"output_router_logits": false,
"rms_norm_eps": 1e-05,
"rope_theta": 10000.0,
"router_aux_loss_coef": 0.001,
"sliding_window": null,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.37.2",
"use_cache": false,
"vocab_size": 32000
}
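The README's "over 24 billion parameters" can be cross-checked against this config. The sketch below assumes the usual Mixtral layout: no bias terms, three weight matrices per expert, one router matrix and two RMSNorm vectors per layer, and separate input embedding and output head (since `tie_word_embeddings` is `false`). `count_parameters` is a hypothetical helper, not part of any library.

```python
# Values copied from config.json above.
config = {
    "hidden_size": 4096,
    "intermediate_size": 14336,
    "num_attention_heads": 32,
    "num_hidden_layers": 32,
    "num_key_value_heads": 8,
    "num_local_experts": 4,
    "num_experts_per_tok": 2,
    "vocab_size": 32000,
}


def count_parameters(cfg, experts_used=None):
    """Count weights, optionally counting only `experts_used` experts per layer."""
    h = cfg["hidden_size"]
    head_dim = h // cfg["num_attention_heads"]
    kv_dim = head_dim * cfg["num_key_value_heads"]
    experts = cfg["num_local_experts"] if experts_used is None else experts_used
    attention = 2 * h * h + 2 * h * kv_dim         # q/o and k/v projections
    expert_mlp = 3 * h * cfg["intermediate_size"]  # three matrices per expert
    router = h * cfg["num_local_experts"]          # gate over all experts
    norms = 2 * h                                  # two RMSNorm vectors
    per_layer = attention + experts * expert_mlp + router + norms
    embeddings = 2 * cfg["vocab_size"] * h         # input embedding + LM head
    return cfg["num_hidden_layers"] * per_layer + embeddings + h  # + final norm


total = count_parameters(config)
active = count_parameters(config, experts_used=config["num_experts_per_tok"])
print(f"total parameters: {total:,}")
print(f"used per token:   {active:,}")
print(f"bfloat16 weights: {2 * total / 1e9:.1f} GB")
```

Under these assumptions the total is 24,153,690,112 parameters, or about 48.3 GB at two bytes per weight, which is in line with the sizes recorded in the Git LFS pointers of this commit.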


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0607bd8a873a44b1230c3ca9e705a35efd5a6a97e00989198846dcb5fb46fed0
size 9919666872


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1c94bbfcfdb8edba4d84fecd7d1890ff704d7657be1f536ed9776f5afc948b58
size 9898670488


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1d4a3a0eb78d8b1539f28e4ee24687a40182c8985a8e590c2eb8bfdb8dd78662
size 9982573720


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7bf033340d46670ed95d0602be1572755b59c5b36b2902b0c368ae1a8389bf0f
size 9982573720


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3c8dc1128fb3edf6d7dc6e48892664f55c7a1ee6a9ae103b90a621d2ca2123e6
size 8523972544
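Each of the five entries above is a Git LFS pointer: a small text file that records the hash and byte size of the real object instead of its contents. As a minimal sketch, the Python below parses one pointer and adds up the five recorded sizes; `parse_lfs_pointer` is a hypothetical helper, and treating these files as the model weights is an assumption, since the commit view does not show their names.

```python
def parse_lfs_pointer(text):
    """Parse the space-separated key/value lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


# First pointer from this commit, copied verbatim.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:0607bd8a873a44b1230c3ca9e705a35efd5a6a97e00989198846dcb5fb46fed0
size 9919666872
"""
info = parse_lfs_pointer(pointer)

# Sizes of the five pointers in this commit, in bytes.
sizes = [9919666872, 9898670488, 9982573720, 9982573720, 8523972544]
total_bytes = sum(sizes)
print(info["hash_algorithm"], info["size"])
print(f"{total_bytes:,} bytes = {total_bytes / 1e9:.1f} GB")
```

The five sizes add up to 48,307,457,344 bytes, roughly 48.3 GB, which is what one would expect for about 24 billion `bfloat16` weights.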

File diff suppressed because one or more lines are too long

24
special_tokens_map.json Normal file

@@ -0,0 +1,24 @@
{
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": "<s>",
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}

91129
tokenizer.json Normal file

File diff suppressed because it is too large

BIN
tokenizer.model (Stored with Git LFS) Normal file

Binary file not shown.

46
tokenizer_config.json Normal file

@@ -0,0 +1,46 @@
{
"add_bos_token": true,
"add_eos_token": false,
"add_prefix_space": true,
"added_tokens_decoder": {
"0": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"1": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
},
"2": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false,
"special": true
}
},
"additional_special_tokens": [],
"bos_token": "<s>",
"chat_template": "{{ bos_token }}{% for message in messages %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% if message['role'] == 'user' %}{{ '[INST] ' + message['content'] + ' [/INST]' }}{% elif message['role'] == 'assistant' %}{{ message['content'] + eos_token}}{% else %}{{ raise_exception('Only user and assistant roles are supported!') }}{% endif %}{% endfor %}",
"clean_up_tokenization_spaces": false,
"eos_token": "</s>",
"legacy": true,
"model_max_length": 32768,
"pad_token": "<s>",
"padding_side": "left",
"sp_model_kwargs": {},
"spaces_between_special_tokens": false,
"tokenizer_class": "LlamaTokenizer",
"unk_token": "<unk>",
"use_default_system_prompt": false,
"use_fast": true
}
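The `chat_template` entry above is a Jinja template that turns a list of messages into the `[INST] ... [/INST]` prompt format used in the README examples. The following pure-Python sketch mirrors its logic so the result can be inspected without loading the tokenizer; `render_chat` is a hypothetical helper written for illustration, and in practice the template would normally be applied by the tokenizer itself.

```python
BOS, EOS = "<s>", "</s>"  # bos_token and eos_token from the config above


def render_chat(messages):
    """Mirror the Jinja chat template: alternate user/assistant, user first."""
    prompt = BOS
    for index, message in enumerate(messages):
        role, content = message["role"], message["content"]
        # Even positions must be user turns, odd positions assistant turns.
        if (role == "user") != (index % 2 == 0):
            raise ValueError(
                "Conversation roles must alternate user/assistant/user/assistant/..."
            )
        if role == "user":
            prompt += "[INST] " + content + " [/INST]"
        elif role == "assistant":
            prompt += content + EOS
        else:
            raise ValueError("Only user and assistant roles are supported!")
    return prompt


conversation = [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi!"},
    {"role": "user", "content": "explain step by step 25-4*2+3=?"},
]
print(render_chat(conversation))
```

Note that the template accepts only `user` and `assistant` roles, so a system prompt has to be placed inside the first user message, as the README examples do.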