Initialize project; model provided by the ModelHub XC community
Model: NousResearch/Redmond-Hermes-Coder Source: Original Platform
109
README.md
Normal file
@@ -0,0 +1,109 @@
---
license: gpl
language:
- en
tags:
- starcoder
- wizardcoder
- code
- self-instruct
- distillation
---

# Model Card: Redmond-Hermes-Coder 15B

## Model Description

Redmond-Hermes-Coder 15B is a state-of-the-art language model fine-tuned on over 300,000 instructions. This model was fine-tuned by Nous Research, with Teknium and Karan4D leading the fine-tuning process and dataset curation, Redmond AI sponsoring the compute, and several other contributors taking part.

This model was trained with a WizardCoder base, which itself uses a StarCoder base model.

The model is strong at code, but it comes with a tradeoff. While far better at code than the original Nous-Hermes built on Llama, it is worse than WizardCoder on pure code benchmarks such as HumanEval.

It scores 39% on HumanEval, compared with 57% for WizardCoder. This is a preliminary experiment, and we are exploring improvements now.

However, it does seem better than WizardCoder at a variety of non-code tasks, including writing.

## Model Training

The model was trained almost entirely on synthetic GPT-4 outputs. This includes data from diverse sources such as GPTeacher (the general, roleplay v1 & v2, and code instruct datasets), Nous Instruct & PDACTL (unpublished), CodeAlpaca, Evol_Instruct Uncensored, GPT4-LLM, and Unnatural Instructions.

Additional data came from Camel-AI's Biology, Physics, Chemistry, and Math datasets, the Airoboros (v1) GPT-4 dataset, and further CodeAlpaca data. In total, the training data comprised over 300,000 instructions.

## Collaborators
The model fine-tuning and the datasets were a collaboration of efforts and resources from members of Nous Research, including Teknium, Karan4D, and Huemin Art, together with Redmond AI's generous compute grants.

A huge shoutout and acknowledgement goes to all the dataset creators who generously share their datasets openly.

Among the dataset contributors, GPTeacher was made available by Teknium, Wizard LM by nlpxucan, and the Nous Research Instruct Dataset by Karan4D and HueminArt.
GPT4-LLM and Unnatural Instructions were provided by Microsoft, the Airoboros dataset by jondurbin, the Camel-AI datasets by Camel-AI, and the CodeAlpaca dataset by Sahil 2801.
If anyone was left out, please open a thread in the community tab.

## Prompt Format

The model follows the Alpaca prompt format:
```
### Instruction:

### Response:
```

or

```
### Instruction:

### Input:

### Response:
```

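As a rough illustration of how these templates might be filled in, the Python sketch below builds either variant from an instruction and an optional input. The function name `build_alpaca_prompt` is hypothetical, and the exact whitespace (content on the line after each header, one blank line between sections) follows the common Alpaca convention and is an assumption; the card shows only the headers.

```python
from typing import Optional


def build_alpaca_prompt(instruction: str, input_text: Optional[str] = None) -> str:
    """Fill the Alpaca-style template (hypothetical helper, not from the card).

    Assumption: text goes on the line after each header, with one blank
    line between sections. The prompt ends right after "### Response:" so
    the model writes the answer.
    """
    if input_text:
        # Variant with an "### Input:" section for extra context.
        return (
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:\n"
        )
    # Variant with only an instruction.
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


if __name__ == "__main__":
    print(build_alpaca_prompt("Write a function that reverses a string."))
    print(build_alpaca_prompt("Summarize the text.", "StarCoder is a code model."))
```

Leaving the prompt open after `### Response:` lets the model's generated text serve directly as the answer.
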
## Resources for Applied Use Cases
For an example of a back-and-forth chatbot using Hugging Face transformers and Discord, check out: https://github.com/teknium1/alpaca-discord
For an example of a roleplaying Discord bot, check out: https://github.com/teknium1/alpaca-roleplay-discordbot

## Future Plans
The model is currently being uploaded in FP16 format, and there are plans to convert it to GGML and GPTQ 4-bit quantizations. The team is also working on a full benchmark, similar to what was done for GPT4-x-Vicuna. We will also try to open discussions about getting the model included in GPT4All.

## Benchmark Results
```
HumanEval: 39%
|                      Task                      |Version|       Metric        |Value |   |Stderr|
|------------------------------------------------|------:|---------------------|-----:|---|-----:|
|arc_challenge                                   |      0|acc                  |0.2858|±  |0.0132|
|                                                |       |acc_norm             |0.3148|±  |0.0136|
|arc_easy                                        |      0|acc                  |0.5349|±  |0.0102|
|                                                |       |acc_norm             |0.5097|±  |0.0103|
|bigbench_causal_judgement                       |      0|multiple_choice_grade|0.5158|±  |0.0364|
|bigbench_date_understanding                     |      0|multiple_choice_grade|0.5230|±  |0.0260|
|bigbench_disambiguation_qa                      |      0|multiple_choice_grade|0.3295|±  |0.0293|
|bigbench_geometric_shapes                       |      0|multiple_choice_grade|0.1003|±  |0.0159|
|                                                |       |exact_str_match      |0.0000|±  |0.0000|
|bigbench_logical_deduction_five_objects         |      0|multiple_choice_grade|0.2260|±  |0.0187|
|bigbench_logical_deduction_seven_objects        |      0|multiple_choice_grade|0.1957|±  |0.0150|
|bigbench_logical_deduction_three_objects        |      0|multiple_choice_grade|0.3733|±  |0.0280|
|bigbench_movie_recommendation                   |      0|multiple_choice_grade|0.3200|±  |0.0209|
|bigbench_navigate                               |      0|multiple_choice_grade|0.4830|±  |0.0158|
|bigbench_reasoning_about_colored_objects        |      0|multiple_choice_grade|0.4150|±  |0.0110|
|bigbench_ruin_names                             |      0|multiple_choice_grade|0.2143|±  |0.0194|
|bigbench_salient_translation_error_detection    |      0|multiple_choice_grade|0.2926|±  |0.0144|
|bigbench_snarks                                 |      0|multiple_choice_grade|0.5249|±  |0.0372|
|bigbench_sports_understanding                   |      0|multiple_choice_grade|0.4817|±  |0.0159|
|bigbench_temporal_sequences                     |      0|multiple_choice_grade|0.2700|±  |0.0140|
|bigbench_tracking_shuffled_objects_five_objects |      0|multiple_choice_grade|0.1864|±  |0.0110|
|bigbench_tracking_shuffled_objects_seven_objects|      0|multiple_choice_grade|0.1349|±  |0.0082|
|bigbench_tracking_shuffled_objects_three_objects|      0|multiple_choice_grade|0.3733|±  |0.0280|
|boolq                                           |      1|acc                  |0.5498|±  |0.0087|
|hellaswag                                       |      0|acc                  |0.3814|±  |0.0048|
|                                                |       |acc_norm             |0.4677|±  |0.0050|
|openbookqa                                      |      0|acc                  |0.1960|±  |0.0178|
|                                                |       |acc_norm             |0.3100|±  |0.0207|
|piqa                                            |      0|acc                  |0.6600|±  |0.0111|
|                                                |       |acc_norm             |0.6610|±  |0.0110|
|winogrande                                      |      0|acc                  |0.5343|±  |0.0140|
```

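One way to read the `Stderr` column, assuming it is the standard error of each score (the card does not say so explicitly), is to turn it into an approximate 95% interval using the normal approximation. The helper below is a hypothetical sketch, applied to the `arc_challenge` `acc` row from the table.

```python
from typing import Tuple


def confidence_interval(value: float, stderr: float, z: float = 1.96) -> Tuple[float, float]:
    """Return (low, high) = value -/+ z * stderr.

    Assumption: `stderr` is a standard error and the normal approximation
    is adequate; z = 1.96 corresponds to roughly 95% coverage.
    """
    margin = z * stderr
    return value - margin, value + margin


if __name__ == "__main__":
    # arc_challenge acc from the table: 0.2858 with stderr 0.0132
    low, high = confidence_interval(0.2858, 0.0132)
    print(f"arc_challenge acc: {low:.4f} to {high:.4f}")
```

Under these assumptions, two scores whose intervals overlap heavily should not be treated as meaningfully different.
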
## Model Usage
The model is available for download on Hugging Face. It is suitable for a wide range of language tasks, from generating creative text to understanding and following complex instructions.

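A minimal sketch of loading the model with the Hugging Face `transformers` library and generating a reply might look like the following. It assumes the weights are published under the repository id `NousResearch/Redmond-Hermes-Coder` (taken from the commit header), that `torch`, `transformers`, and `accelerate` are installed, and that enough GPU memory is available for a 15B model in FP16. The template whitespace, the `generate` helper, and the generation settings are illustrative assumptions, not recommendations from the authors.

```python
# Assumed repository id, taken from the commit header of this card.
MODEL_ID = "NousResearch/Redmond-Hermes-Coder"

# Alpaca-style template; the exact whitespace is an assumption.
PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"


def generate(instruction: str, model_id: str = MODEL_ID, max_new_tokens: int = 256) -> str:
    """Load the model and return its reply to one instruction (hypothetical helper)."""
    # Imported lazily so this module can be imported without the heavy packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # the card mentions an FP16 upload
        device_map="auto",          # requires the `accelerate` package
    )

    prompt = PROMPT_TEMPLATE.format(instruction=instruction)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Write a Python function that checks whether a number is prime."))
```
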

Compute was provided by our project sponsor, Redmond AI. Thank you!