---
license: other
license_name: llama-3
license_link: https://llama.meta.com/llama3/license/
tags:
- text-generation-inference
- transformers
- unsloth
- llama
datasets:
- Replete-AI/code_bagel_hermes-2.5
- Replete-AI/code_bagel
- Replete-AI/OpenHermes-2.5-Uncensored
- teknium/OpenHermes-2.5
- layoric/tiny-codes-alpaca
- glaiveai/glaive-code-assistant-v3
- ajibawa-2023/Code-290k-ShareGPT
- TIGER-Lab/MathInstruct
- chargoddard/commitpack-ft-instruct-rated
- iamturun/code_instructions_120k_alpaca
- ise-uiuc/Magicoder-Evol-Instruct-110K
- cognitivecomputations/dolphin-coder
- nickrosh/Evol-Instruct-Code-80k-v1
- coseal/CodeUltraFeedback_binarized
- glaiveai/glaive-function-calling-v2
- CyberNative/Code_Vulnerability_Security_DPO
- jondurbin/airoboros-2.2
- camel-ai
- lmsys/lmsys-chat-1m
- CollectiveCognition/chats-data-2023-09-22
- CoT-Alpaca-GPT4
- WizardLM/WizardLM_evol_instruct_70k
- WizardLM/WizardLM_evol_instruct_V2_196k
- teknium/GPT4-LLM-Cleaned
- GPTeacher
- OpenGPT
- meta-math/MetaMathQA
- Open-Orca/SlimOrca
- garage-bAInd/Open-Platypus
- anon8231489123/ShareGPT_Vicuna_unfiltered
- Unnatural-Instructions-GPT4
model-index:
- name: Replete-Coder-llama3-8b
  results:
  - task:
      name: HumanEval
      type: text-generation
    dataset:
      type: openai_humaneval
      name: HumanEval
    metrics:
    - name: pass@1
      type: pass@1
      value: .64683835842678326
      verified: True
  - task:
      name: AI2 Reasoning Challenge
      type: text-generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: accuracy
      value:
      name: normalized accuracy
    source:
      url: https://www.placeholderurl.com
      name: Open LLM Leaderboard
  - task:
      name: Text Generation
      type: text-generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: accuracy
      value:
      name: normalized accuracy
    source:
      url: https://www.placeholderurl.com
      name: Open LLM Leaderboard
  - task:
      name: Text Generation
      type: text-generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: accuracy
      value:
      name: accuracy
    source:
      url: https://www.placeholderurl.com
      name: Open LLM Leaderboard
  - task:
      name: Text Generation
      type: text-generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: multiple_choice_accuracy
      value:
    source:
      url: https://www.placeholderurl.com
      name: Open LLM Leaderboard
  - task:
      name: Text Generation
      type: text-generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: accuracy
      value:
      name: accuracy
    source:
      url: https://www.placeholderurl.com
      name: Open LLM Leaderboard
  - task:
      name: Text Generation
      type: text-generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: accuracy
      value:
      name: accuracy
    source:
      url: https://www.placeholderurl.com
      name: Open LLM Leaderboard
---
# Replete-Coder-llama3-8b
Finetuned by: Rombodawg
### More than just a coding model!
Although Replete-Coder has amazing coding capabilities, it is trained on a vast amount of non-coding data, fully cleaned and uncensored. Don't just use it for coding, use it for all your needs! We are truly trying to make the GPT killer!
![image/png](https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/-0dERC793D9XeFsJ9uHbx.png)

Thank you to TensorDock for sponsoring Replete-Coder-llama3-8b and Replete-Coder-Qwen2-1.5b.
You can check out their website for cloud compute rental below.
- https://tensordock.com
__________________________________________________________________________________________________
Replete-Coder-llama3-8b is a general-purpose model that is specially trained for coding in over 100 programming languages. The training data contains 25% non-code instruction data and 75% coding instruction data, totaling 3.9 million lines, roughly 1 billion tokens, or 7.27 GB of instruct data. The data was 100% uncensored and fully deduplicated before training.
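
To make the data mix above concrete, here is a rough back-of-the-envelope sketch. It assumes the 25%/75% split applies to line counts, which this card does not state explicitly (the split could equally be by tokens or bytes), so treat the outputs as estimates only.

```python
# Figures quoted in the paragraph above.
TOTAL_LINES = 3_900_000        # "3.9 million lines"
TOTAL_TOKENS = 1_000_000_000   # "roughly 1 billion tokens"
CODE_SHARE = 0.75              # coding instruction data
NON_CODE_SHARE = 0.25          # non-code instruction data


def split_lines(total, code_share, non_code_share):
    """Return (code_lines, non_code_lines).

    Assumption: the percentages are measured in lines, not tokens or bytes.
    """
    return round(total * code_share), round(total * non_code_share)


code_lines, non_code_lines = split_lines(TOTAL_LINES, CODE_SHARE, NON_CODE_SHARE)
avg_tokens_per_line = TOTAL_TOKENS / TOTAL_LINES

print(code_lines)                  # 2925000
print(non_code_lines)              # 975000
print(round(avg_tokens_per_line))  # 256
```

Under that assumption, an average training line is roughly 256 tokens, far below the 8192-token context window.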

The Replete-Coder models (including Replete-Coder-llama3-8b and Replete-Coder-Qwen2-1.5b) feature the following:

- Advanced coding capabilities in over 100 coding languages
- Advanced code translation (between languages)
- Security and vulnerability prevention related coding capabilities
- General purpose use
- Uncensored use
- Function calling
- Advanced math use
- Use on low-end (8b) and mobile (1.5b) platforms

Notice: The Replete-Coder series of models is fine-tuned on a context window of 8192 tokens. Performance past this context window is not guaranteed.
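
One way to stay inside that window is to budget tokens before generating. The sketch below is illustrative only: the helper names are hypothetical, and it works on a plain list of token IDs so it does not depend on any particular tokenizer.

```python
CONTEXT_WINDOW = 8192  # fine-tuning context length stated in the notice above


def max_new_tokens_allowed(prompt_token_count, context_window=CONTEXT_WINDOW):
    """Tokens left for generation; 0 if the prompt already fills the window."""
    return max(context_window - prompt_token_count, 0)


def truncate_to_window(token_ids, reserve_for_output, context_window=CONTEXT_WINDOW):
    """Keep the most recent tokens so prompt plus reserved output fits the window."""
    budget = context_window - reserve_for_output
    if budget <= 0:
        raise ValueError("reserve_for_output must be smaller than the context window")
    return token_ids[-budget:]


print(max_new_tokens_allowed(8000))                        # 192
print(len(truncate_to_window(list(range(10_000)), 512)))   # 7680
```

Truncating from the left keeps the most recent text but can cut off the system prompt, so a real application may prefer to trim the middle of a long instruction instead.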
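
![image/png](https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/C-zxpY5n8KuzQeocmhk0g.png)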
__________________________________________________________________________________________________
You can find the 25% non-coding instruction data below:

- https://huggingface.co/datasets/Replete-AI/OpenHermes-2.5-Uncensored

And the 75% coding-specific instruction data below:

- https://huggingface.co/datasets/Replete-AI/code_bagel

These two datasets were combined to create the final dataset for training, which is linked below:

- https://huggingface.co/datasets/Replete-AI/code_bagel_hermes-2.5
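
The exact merge pipeline is not described in this card. As a minimal sketch of the general idea (combine two instruction datasets, then drop exact duplicates), something like the following would work. The record fields and the hash-based exact-match strategy are assumptions for illustration, not the method actually used.

```python
import hashlib
import json


def record_key(record):
    """Stable content hash of a record, independent of key order."""
    canonical = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def combine_and_dedupe(*datasets):
    """Concatenate datasets in order, keeping the first copy of each exact duplicate."""
    seen = set()
    combined = []
    for dataset in datasets:
        for record in dataset:
            key = record_key(record)
            if key not in seen:
                seen.add(key)
                combined.append(record)
    return combined


# Toy stand-ins for the two source datasets (hypothetical rows and field names).
code_rows = [
    {"instruction": "Reverse a string in Python.", "output": "s[::-1]"},
    {"instruction": "Reverse a string in Python.", "output": "s[::-1]"},
]
general_rows = [
    {"instruction": "Name a prime number.", "output": "7"},
]

merged = combine_and_dedupe(code_rows, general_rows)
print(len(merged))  # 2
```

Exact-match hashing only removes byte-identical records; near-duplicates would need a fuzzier comparison.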
__________________________________________________________________________________________________
## Prompt Template: Custom Alpaca
```
### System:
{}

### Instruction:
{}

### Response:
{}
```
Note: The system prompt varies in the training data, but the most commonly used one is:
```
Below is an instruction that describes a task, Write a response that appropriately completes the request.
```
End token:
```
<|endoftext|>
```
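
As a minimal sketch, the template, the common system prompt, and the end token above can be assembled in plain Python as follows. The function names are hypothetical, and the exact whitespace (one blank line between sections, end token appended directly after the response) is an assumption read off the template block rather than something confirmed by the training code.

```python
# Template copied from the "Custom Alpaca" block; newline placement is assumed.
PROMPT_TEMPLATE = "### System:\n{}\n\n### Instruction:\n{}\n\n### Response:\n{}"

# Most commonly used system prompt, copied verbatim from this card.
DEFAULT_SYSTEM_PROMPT = (
    "Below is an instruction that describes a task, "
    "Write a response that appropriately completes the request."
)
END_TOKEN = "<|endoftext|>"


def build_prompt(instruction, system=DEFAULT_SYSTEM_PROMPT, response=""):
    """Fill the template; leave response empty when prompting for a completion."""
    return PROMPT_TEMPLATE.format(system, instruction, response)


def build_training_example(instruction, response, system=DEFAULT_SYSTEM_PROMPT):
    """Full example with the end token appended (assumed placement)."""
    return build_prompt(instruction, system, response) + END_TOKEN


def strip_end_token(generated):
    """Cut generated text at the first end token, if present."""
    return generated.split(END_TOKEN, 1)[0]


prompt = build_prompt("Write a function that adds two numbers in Rust.")
print(prompt)
```

The resulting string can be passed to whatever inference stack you use, with `<|endoftext|>` configured as the stop sequence or trimmed afterwards with `strip_end_token`.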
__________________________________________________________________________________________________
Thank you to the community for your contributions to the Replete-AI/code_bagel_hermes-2.5 dataset. Without the participation of so many members making their datasets free and open source for anyone to use, this amazing AI model wouldn't be possible.

Extra special thanks to Teknium for the OpenHermes-2.5 dataset and jondurbin for the bagel dataset and the naming idea for the code_bagel series of datasets. You can find both of their Hugging Face accounts linked below:

- https://huggingface.co/teknium
- https://huggingface.co/jondurbin

Another special thanks to unsloth for being the main method of training for Replete-Coder. Below you can find their GitHub, as well as the special Replete-Ai secret sauce (Unsloth + Qlora + Galore) Colab code document that was used to train this model.

- https://github.com/unslothai/unsloth
- https://colab.research.google.com/drive/1VAaxMQJN9-78WLsPU0GWg5tEkasXoTP9?usp=sharing
__________________________________________________________________________________________________
## Join the Replete-Ai discord! We are a great and loving community!

- https://discord.gg/ZZbnsmVnjD