47 lines
1.4 KiB
Markdown
47 lines
1.4 KiB
Markdown
|
|
---
|
||
|
|
library_name: transformers
|
||
|
|
license: apache-2.0
|
||
|
|
tags:
|
||
|
|
- merge
|
||
|
|
- mergekit
|
||
|
|
- mistral
|
||
|
|
- nemo
|
||
|
|
- model_stock
|
||
|
|
base_model:
|
||
|
|
- mistralai/Mistral-Nemo-Instruct-2407
|
||
|
|
- LatitudeGames/Muse-12B
|
||
|
|
- allura-org/Tlacuilo-12B
|
||
|
|
---
|
||
|
|
|
||
|
|
# 🐈 Musecuilo 12B Model_Stock
|
||
|
|
|
||
|
|

|
||
|
|
|
||
|
|
> [!NOTE]
|
||
|
|
> <span style="color:red; font-weight:bold">Note:</span> Use **Mistral Tekken** (recommended) or **ChatML** chat template for best results. The model has some refusals but can be jailbroken or ablated as needed.
|
||
|
|
>
|
||
|
|
|
||
|
|
This model was merged using the [`model_stock`](https://arxiv.org/abs/2403.19522) merge method.
|
||
|
|
|
||
|
|
Musecuilo is a merge of the following models using [mergekit](https://github.com/cg123/mergekit):
|
||
|
|
* [mistralai/Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407)
|
||
|
|
* [LatitudeGames/Muse-12B](https://huggingface.co/LatitudeGames/Muse-12B)
|
||
|
|
* [allura-org/Tlacuilo-12B](https://huggingface.co/allura-org/Tlacuilo-12BS)
|
||
|
|
|
||
|
|
## 🧩 Configuration
|
||
|
|
|
||
|
|
```yaml
|
||
|
|
architecture: MistralForCausalLM
|
||
|
|
base_model: B:/12B/mistralai--Mistral-Nemo-Instruct-2407
|
||
|
|
models:
|
||
|
|
- model: B:/12B/allura-org--Tlacuilo-12B
|
||
|
|
- model: B:/12B/LatitudeGames--Muse-12B
|
||
|
|
merge_method: model_stock
|
||
|
|
parameters:
|
||
|
|
filter_wise: true
|
||
|
|
dtype: float32
|
||
|
|
out_dtype: bfloat16
|
||
|
|
tokenizer:
|
||
|
|
source: B:/12B/LatitudeGames--Muse-12B
|
||
|
|
name: Musecuilo-12B-Model_Stock
|
||
|
|
```
|