---
inference: false
language: en
license: llama2
model_type: llama
datasets:
- mlabonne/CodeLlama-2-20k
pipeline_tag: text-generation
base_model:
- davzoku/cria-llama2-7b-v1.3
library_name: transformers
tags:
- mergekit
- merge
- llama-2
---

# FrankenCRIA v1.3-m.2

## What is FrankenCRIA?

<p align="center">
<img src="https://github.com/davzoku/cria/blob/main/assets/frankencria-icon-512x512.png?raw=true" width="300" height="300" alt="FrankenCRIA Logo"> <br>
<i>This is a frankenmerge of <a href="https://huggingface.co/davzoku/cria-llama2-7b-v1.3">davzoku/cria-llama2-7b-v1.3</a>.</i>
</p>

The configuration is the same as [vilm/vinallama-12.5b-chat-DUS](https://huggingface.co/vilm/vinallama-12.5b-chat-DUS).

Please be aware that this model is highly experimental, and no further training has been conducted following the merge.
Therefore, the model performance may not meet expectations, as described in the [SOLAR paper](https://arxiv.org/abs/2312.15166).

## 📦 FrankenCRIA Model Release

FrankenCRIA v1.3 comes with several variants.

- [davzoku/frankencria-llama2-11b-v1.3-m.1](https://huggingface.co/davzoku/frankencria-llama2-11b-v1.3-m.1): 11B FrankenMerge inspired by [Undi95/Mistral-11B-v0.1](https://huggingface.co/Undi95/Mistral-11B-v0.1)
- [davzoku/frankencria-llama2-12.5b-v1.3-m.2](https://huggingface.co/davzoku/frankencria-llama2-12.5b-v1.3-m.2): 12.5B interleaving FrankenMerge inspired by [vilm/vinallama-12.5b-chat-DUS](https://huggingface.co/vilm/vinallama-12.5b-chat-DUS)

## 🧩 Merge Details

### Merge Method

This model was merged using the passthrough merge method.

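Conceptually, a passthrough merge over layer slices does no weight averaging at all: the merged model's decoder stack is simply the concatenation of the requested layer ranges, so some base-model layers appear more than once. The sketch below (a simplification, not mergekit's actual implementation) shows the resulting layer order for this card's configuration:

```python
# Minimal sketch of a "passthrough" merge over layer slices: the merged
# model's layer stack is the concatenation of the requested ranges.

def passthrough_layer_order(slices):
    """Return the source-layer indices of the merged model, in order.

    `slices` is a list of (start, end) half-open ranges, mirroring the
    `layer_range` entries in a mergekit config.
    """
    order = []
    for start, end in slices:
        order.extend(range(start, end))
    return order

# The layer_range entries from this model's configuration:
order = passthrough_layer_order(
    [(0, 16), (8, 16), (8, 16), (16, 24), (16, 24), (24, 28), (24, 28), (28, 32)]
)
print(len(order))  # 60 layers in the merged model, vs. 32 in the base
```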
### Models Merged

The following models were included in the merge:

* [davzoku/cria-llama2-7b-v1.3](https://huggingface.co/davzoku/cria-llama2-7b-v1.3)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
# https://huggingface.co/vilm/vinallama-12.5b-chat-DUS
slices:
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [0, 16]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [8, 16]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [8, 16]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [16, 24]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [16, 24]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [24, 28]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [24, 28]
  - sources:
      - model: davzoku/cria-llama2-7b-v1.3
        layer_range: [28, 32]
merge_method: passthrough
dtype: bfloat16
```
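As a back-of-the-envelope sanity check on the "12.5B" label, the slice lengths above sum to 60 decoder layers, and a rough parameter tally (assuming standard Llama-2-7B dimensions: hidden size 4096, MLP intermediate size 11008, vocabulary 32000, and ignoring the small norm weights) lands close to that figure:

```python
# Back-of-the-envelope parameter count for the merged model, assuming
# standard Llama-2-7B dimensions (hidden 4096, MLP 11008, vocab 32000).
hidden, mlp, vocab = 4096, 11008, 32000

attn = 4 * hidden * hidden       # q, k, v, o projections
ffn = 3 * hidden * mlp           # gate, up, down projections
per_layer = attn + ffn           # ~202M parameters per decoder layer

embeddings = 2 * vocab * hidden  # input embeddings + LM head

# layer_range lengths from the config: 16+8+8+8+8+4+4+4 = 60 layers
layers = 16 + 8 + 8 + 8 + 8 + 4 + 4 + 4
total = layers * per_layer + embeddings
print(f"{layers} layers, ~{total / 1e9:.1f}B parameters")  # 60 layers, ~12.4B parameters
```

Norm weights and rounding account for the small gap to the advertised 12.5B.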