---
license: cc-by-nc-4.0
tags:
- not-for-all-audiences
- nsfw
---

## Borealis

Borealis-10.7B-DPO is a 10.7B model made of 48 Mistral 7B layers, finetuned for over 70 hours on 2xA6000 GPUs on a large RP and conversational dataset, using the llama2 configuration of Axolotl, like SOLAR.

This variant was additionally trained with DPO on top of that finetune.
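
As a rough sanity check on the "48 Mistral 7B layers, 10.7B" figure, the parameter count can be estimated from the layer shapes. This is only a sketch: the constants are taken from the public Mistral 7B configuration, and it assumes the base 32000-token vocabulary with an untied output head, which may not match this repo exactly if tokens were added. The function names are made up for illustration.

```python
# Back-of-the-envelope parameter count for a Mistral-style decoder.
# Constants follow the public Mistral 7B config; VOCAB is an assumption
# (base vocabulary, no extra tokens), so treat the totals as approximate.
HIDDEN = 4096          # hidden size
INTERMEDIATE = 14336   # MLP intermediate size
NUM_HEADS = 32         # attention heads
NUM_KV_HEADS = 8       # key/value heads (grouped-query attention)
VOCAB = 32000          # assumed vocabulary size


def layer_params() -> int:
    """Parameters in one decoder layer (no biases)."""
    head_dim = HIDDEN // NUM_HEADS
    kv_dim = NUM_KV_HEADS * head_dim
    attn = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_dim  # q, o + k, v projections
    mlp = 3 * HIDDEN * INTERMEDIATE                   # gate, up, down projections
    norms = 2 * HIDDEN                                # two RMSNorm weight vectors
    return attn + mlp + norms


def total_params(num_layers: int) -> int:
    """Total parameters for a stack of `num_layers` decoder layers."""
    embeddings = 2 * VOCAB * HIDDEN  # input embeddings + untied LM head
    final_norm = HIDDEN
    return num_layers * layer_params() + embeddings + final_norm


if __name__ == "__main__":
    for n in (32, 48):
        print(f"{n} layers: {total_params(n) / 1e9:.2f}B parameters")
```

Under these assumptions, 32 layers give about 7.24B parameters (the base Mistral 7B) and 48 layers give about 10.73B, which is consistent with the 10.7B in the model name.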
<!-- description start -->
## Description

This repo contains the fp16 files of Borealis-10.7B-DPO, a conversational model.

The goal of this model isn't to break every benchmark, but to be a better RP/ERP/conversational model.

It was trained on multiple basic datasets to make it intelligent, but the majority of the training data was basic conversations.

<!-- description end -->
<!-- description start -->
## Dataset used

- NobodyExistsOnTheInternet/ToxicQAFinal
- teknium/openhermes
- unalignment/spicy-3.1
- Doctor-Shotgun/no-robots-sharegpt
- Undi95/toxic-dpo-v0.1-sharegpt
- Aesir [1], [2], [3-SFW], [3-NSFW]
- lemonilia/LimaRP
- Squish42/bluemoon-fandom-1-1-rp-cleaned
- Undi95/ConversationChronicles-sharegpt-SHARDED (2 sets, modified)

## DPO Dataset used

- Intel/orca_dpo_pairs
- NobodyExistsOnTheInternet/ToxicDPOqa
- Undi95/toxic-dpo-v0.1-NoWarning

<!-- description end -->
<!-- prompt-template start -->
## Prompt format: NsChatml
```
<|im_system|>
{sysprompt}<|im_end|>
<|im_user|>
{input}<|im_end|>
<|im_bot|>
{output}<|im_end|>
```
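
For convenience, the template above can be assembled with a small helper. This is a minimal sketch, not an official utility: `build_prompt` is a hypothetical name, and it assumes turns are separated by single newlines exactly as the template is laid out. Leaving `output` unset ends the string right after `<|im_bot|>` so the model can write the reply.

```python
from typing import Optional

IM_END = "<|im_end|>"


def build_prompt(sysprompt: str, user_input: str, output: Optional[str] = None) -> str:
    """Format one exchange in the NsChatml layout shown above.

    Hypothetical helper: the single-newline separators are an assumption
    read off the template, not something confirmed by the training config.
    """
    prompt = (
        f"<|im_system|>\n{sysprompt}{IM_END}\n"
        f"<|im_user|>\n{user_input}{IM_END}\n"
        f"<|im_bot|>\n"
    )
    if output is not None:
        # Completed turn, e.g. for few-shot history or training samples.
        prompt += f"{output}{IM_END}"
    return prompt


if __name__ == "__main__":
    print(build_prompt("You are a helpful assistant.", "Hello!"))
```

When generating, it is probably sensible to use `<|im_end|>` as a stop string so the model does not run on into a new turn.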
## Others

If you want to support me, you can do so [here](https://ko-fi.com/undiai).