Initialize project; model provided by the ModelHub XC community
Model: uukuguy/speechless-mistral-six-in-one-7b Source: Original Platform
---
language:
- en
library_name: transformers
pipeline_tag: text-generation
datasets:
- jondurbin/airoboros-2.2.1
- Open-Orca/OpenOrca
- garage-bAInd/Open-Platypus
- ehartford/samantha-data
- CollectiveCognition/chats-data-2023-09-27
- stingning/ultrachat
tags:
- llama-2
- code
license: llama2
model-index:
- name: SpeechlessCoder
  results:
  - task:
      type: text-generation
    dataset:
      type: openai_humaneval
      name: HumanEval
    metrics:
    - name: pass@1
      type: pass@1
      value: 0.0
      verified: false
---

<p><h1> speechless-mistral-six-in-one-7b </h1></p>

This model is a merge of 6 SOTA Mistral-7B-based models:
- ehartford/dolphin-2.1-mistral-7b
- Open-Orca/Mistral-7B-OpenOrca
- bhenrym14/mistral-7b-platypus-fp16
- ehartford/samantha-1.2-mistral-7b
- teknium/CollectiveCognition-v1.1-Mistral-7B
- HuggingFaceH4/zephyr-7b-alpha

[Model benchmark](https://huggingface.co/uukuguy/speechless-mistral-six-in-one-7b/discussions/1) by [sethuiyer](https://huggingface.co/sethuiyer). Thanks a lot.
> I tested the Q6_0 version of the model against LLaMa2 70B chat and here are the results - Scoring as per ChatGPT and Bard's average. Named this model Mixtral. Questions taken from MT-Benchmark.
>
> On a scale of 0 to 100, I would rate Mixtral at 98. Here's why:
>
> - Intellect (100/100) - Mixtral has demonstrated immense intellectual abilities through its comprehensive knowledge and logical reasoning skills.
> - Creativity (98/100) - In addition to being highly intelligent, Mixtral also displays impressive creative talents through its unique, nuanced responses.
> - Adaptability (98/100) - Mixtral can converse flexibly on a wide variety of topics, adapting smoothly based on contextual cues.
> - Communication (97/100) - Mixtral communicates clearly and eloquently through written language, thoroughly answering questions.
> - Problem-Solving (98/100) - Questions are addressed comprehensively, considering multiple perspectives to arrive at well-thought solutions.
> - Personability (97/100) - Responses are warm, inviting and non-threatening due to Mixtral's kindness and thoughtfulness.
>
> Overall, a very capable model for its size.

Code: https://github.com/uukuguy/speechless

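The card's metadata names `transformers` as the library and `text-generation` as the pipeline tag, so loading the model might look roughly like the sketch below. It is a minimal example under assumptions, not the author's documented usage: the helper name `generate_text`, the default `max_new_tokens`, and the plain-text prompt format are illustrative, since this card does not specify a prompt template.

```python
MODEL_ID = "uukuguy/speechless-mistral-six-in-one-7b"


def generate_text(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the merged model and return the decoded output for `prompt`.

    Hypothetical helper for illustration; the model is re-loaded on every call.
    """
    # Imported lazily so that defining this helper does not require the
    # heavy dependencies (transformers, torch) to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Assumption: a plain-text prompt; this card does not define a template.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # The decoded string contains the prompt followed by the continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Write a Python function that reverses a string.")` downloads the weights on first use; a 7B model needs substantial memory, so half precision or a quantized build may be preferable in practice.
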
## HumanEval

| Metric | Value |
| --- | --- |
| humaneval-python | |

[Big Code Models Leaderboard](https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard)

CodeLlama-34B-Python: 53.29

CodeLlama-34B-Instruct: 50.79

CodeLlama-13B-Instruct: 50.6

CodeLlama-34B: 45.11

CodeLlama-13B-Python: 42.89

CodeLlama-13B: 35.07

Mistral-7B-v0.1: 30.488

## LM-Evaluation-Harness

[Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

| Metric | Value |
| --- | --- |
| ARC | 62.97 |
| HellaSwag | 84.6 |
| MMLU | 63.29 |
| TruthfulQA | 57.77 |
| Winogrande | 77.51 |
| GSM8K | 18.42 |
| DROP | 9.13 |
| Average | 53.38 |

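The reported average appears to be the unweighted mean of the seven benchmark scores in the table. A quick check, assuming equal weighting (the variable names are illustrative):

```python
# Scores copied from the LM-Evaluation-Harness table in this card.
scores = {
    "ARC": 62.97,
    "HellaSwag": 84.6,
    "MMLU": 63.29,
    "TruthfulQA": 57.77,
    "Winogrande": 77.51,
    "GSM8K": 18.42,
    "DROP": 9.13,
}

# Unweighted mean over all seven benchmarks (assumed weighting).
average = sum(scores.values()) / len(scores)
print(f"Average: {average:.2f}")
```

The sum is 373.69, and dividing by 7 gives about 53.38, which matches the table.
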
# Model Card for Mistral-7B-v0.1

The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters.
Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks we tested.

For full details of this model please read our [paper](https://arxiv.org/abs/2310.06825) and [release blog post](https://mistral.ai/news/announcing-mistral-7b/).

## Model Architecture

Mistral-7B-v0.1 is a transformer model, with the following architecture choices:
- Grouped-Query Attention
- Sliding-Window Attention
- Byte-fallback BPE tokenizer

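To give a feel for what sliding-window attention restricts, the sketch below builds a causal mask in which each position may attend only to itself and a fixed number of preceding positions. It is an illustration of the general idea, not Mistral's implementation: the function name and the tiny window size are hypothetical, and the real window size is given in the linked paper.

```python
from typing import List


def sliding_window_causal_mask(seq_len: int, window: int) -> List[List[bool]]:
    """Return mask[i][j] == True when query position i may attend to key j.

    A position sees itself and the (window - 1) positions before it,
    and never any future position.
    """
    return [
        [(i - window < j <= i) for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Toy example: 4 tokens, window of 2 (illustrative values only).
for row in sliding_window_causal_mask(4, 2):
    print("".join("x" if allowed else "." for allowed in row))
```

With a window of 2, the third token attends to the second and third tokens but no longer to the first, which is the difference from a plain causal mask.
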
## Troubleshooting

- If you see the following error:
```
KeyError: 'mistral'
```
- Or:
```
NotImplementedError: Cannot copy out of meta tensor; no data!
```

Ensure you are utilizing a stable version of Transformers, 4.34.0 or newer.

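One way to check the installed Transformers version against the 4.34.0 minimum before loading the model is sketched below. The helper names are hypothetical, and the parser is deliberately simple: it compares only the numeric `major.minor.patch` part and ignores pre-release suffixes such as `.dev0` or `rc1`, so it does not by itself confirm that a build is a stable release.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum version named in the troubleshooting note above.
MIN_VERSION = (4, 34, 0)


def parse_release(version_string: str) -> tuple:
    """Extract (major, minor, patch) as integers, ignoring any suffix."""
    parts = []
    for piece in version_string.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    # Pad short versions such as "4.36" to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def transformers_is_recent_enough(installed: str = None) -> bool:
    """Return True if the given (or installed) version is >= MIN_VERSION."""
    if installed is None:
        try:
            installed = version("transformers")
        except PackageNotFoundError:
            # Transformers is not installed at all.
            return False
    return parse_release(installed) >= MIN_VERSION
```

If `transformers_is_recent_enough()` returns `False`, upgrading the package (for example with `pip install --upgrade transformers`) is the usual remedy.
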
## Notice

Mistral 7B is a pretrained base model and therefore does not have any moderation mechanisms.

## The Mistral AI Team

Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_uukuguy__speechless-mistral-six-in-one-7b).

| Metric                | Value                     |
|-----------------------|---------------------------|
| Avg.                  | 53.38                     |
| ARC (25-shot)         | 62.97                     |
| HellaSwag (10-shot)   | 84.6                      |
| MMLU (5-shot)         | 63.29                     |
| TruthfulQA (0-shot)   | 57.77                     |
| Winogrande (5-shot)   | 77.51                     |
| GSM8K (5-shot)        | 18.42                     |
| DROP (3-shot)         | 9.13                      |