Initialize project; model provided by the ModelHub XC community
Model: oobabooga/CodeBooga-34B-v0.1 Source: Original Platform

---
license: llama2
---

# CodeBooga-34B-v0.1

This is a merge between the following two models:

1) [Phind-CodeLlama-34B-v2](https://huggingface.co/Phind/Phind-CodeLlama-34B-v2)
2) [WizardCoder-Python-34B-V1.0](https://huggingface.co/WizardLM/WizardCoder-Python-34B-V1.0)

It was created with the [BlockMerge Gradient script](https://github.com/Gryphe/BlockMerge_Gradient), the same one that was used to create [MythoMax-L2-13b](https://huggingface.co/Gryphe/MythoMax-L2-13b), and with the same settings. The following YAML was used:

```yaml
model_path1: "Phind_Phind-CodeLlama-34B-v2_safetensors"
model_path2: "WizardLM_WizardCoder-Python-34B-V1.0_safetensors"
output_model_path: "CodeBooga-34B-v0.1"
operations:
  - operation: lm_head # Single tensor
    filter: "lm_head"
    gradient_values: [0.75]
  - operation: embed_tokens # Single tensor
    filter: "embed_tokens"
    gradient_values: [0.75]
  - operation: self_attn
    filter: "self_attn"
    gradient_values: [0.75, 0.25]
  - operation: mlp
    filter: "mlp"
    gradient_values: [0.25, 0.75]
  - operation: layernorm
    filter: "layernorm"
    gradient_values: [0.5, 0.5]
  - operation: modelnorm # Single tensor
    filter: "model.norm"
    gradient_values: [0.75]
```
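
To give a feel for what a two-value entry such as `gradient_values: [0.75, 0.25]` could mean, here is a minimal Python sketch that spreads a list of anchor values evenly over a run of tensors using linear interpolation. It is an illustration only: the function name `interpolate_gradient`, the tensor count of 5, and the assumption that the values are interpolated linearly between evenly spaced anchors are mine and are not taken from the BlockMerge Gradient source. The sketch also says nothing about which of the two models a given ratio favors; check the script itself for that.

```python
def interpolate_gradient(gradient_values, num_tensors):
    """Linearly interpolate evenly spaced anchor values over num_tensors slots.

    Hypothetical helper for illustration; not part of BlockMerge Gradient.
    """
    if num_tensors < 1:
        raise ValueError("num_tensors must be at least 1")
    if not gradient_values:
        raise ValueError("gradient_values must not be empty")
    # A single anchor (or a single tensor) yields a constant ratio.
    if len(gradient_values) == 1 or num_tensors == 1:
        return [float(gradient_values[0])] * num_tensors

    last = len(gradient_values) - 1
    ratios = []
    for i in range(num_tensors):
        # Position of slot i on the anchor axis, from 0 to `last`.
        pos = i * last / (num_tensors - 1)
        lo = min(int(pos), last - 1)  # index of the anchor to the left
        frac = pos - lo               # how far we are toward the next anchor
        ratios.append(gradient_values[lo] * (1 - frac) + gradient_values[lo + 1] * frac)
    return ratios


if __name__ == "__main__":
    # Assumed tensor count of 5, chosen only to keep the output short.
    print(interpolate_gradient([0.75, 0.25], 5))  # [0.75, 0.625, 0.5, 0.375, 0.25]
    print(interpolate_gradient([0.75], 3))        # [0.75, 0.75, 0.75]
```

Under this assumption, the `self_attn` and `mlp` entries would describe opposite ramps across the layers, while the single-value entries apply one constant ratio.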

## Prompt format

Both base models use the Alpaca format, so it should be used for this one as well.

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
Your instruction

### Response:
Bot reply

### Instruction:
Another instruction

### Response:
Bot reply
```
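
The template above can be assembled programmatically. The following is a minimal sketch in Python; the names `SYSTEM_PROMPT` and `build_prompt` are hypothetical, and ending the prompt with an open `### Response:` line for the turn still to be generated is a common convention that I am assuming here, not something this README specifies.

```python
SYSTEM_PROMPT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
)


def build_prompt(turns, system=SYSTEM_PROMPT):
    """Build an Alpaca-style prompt from (instruction, response) pairs.

    Pass None as the response of the last pair to leave it open for the model.
    """
    blocks = [system]
    for instruction, response in turns:
        block = f"### Instruction:\n{instruction}\n\n### Response:\n"
        if response is not None:
            block += response
        blocks.append(block)
    # Blocks are separated by one blank line, as in the template above.
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(build_prompt([
        ("Your instruction", "Bot reply"),
        ("Another instruction", None),  # turn to be completed by the model
    ]))
```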

## Evaluation

(This is not very scientific, so bear with me.)

I ran a quick experiment in which I asked a set of 3 Python and 3 JavaScript questions (real-world, difficult questions with nuance) to the following models:

1) This one
2) A second variant generated with `model_path1` and `model_path2` swapped in the YAML above, which I called CodeBooga-Reversed-34B-v0.1
3) WizardCoder-Python-34B-V1.0
4) Phind-CodeLlama-34B-v2

Specifically, I used 4.250b EXL2 quantizations of each. I then sorted the responses for each question by quality and assigned the following scores:

* 4th place: 0
* 3rd place: 1
* 2nd place: 2
* 1st place: 4

The resulting cumulative scores were:

* CodeBooga-34B-v0.1: 22
* WizardCoder-Python-34B-V1.0: 12
* Phind-CodeLlama-34B-v2: 7
* CodeBooga-Reversed-34B-v0.1: 1

CodeBooga-34B-v0.1 performed very well, while its variant performed poorly, so I uploaded the former but not the latter.
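
The scoring scheme described above can be sketched in a few lines of Python. The per-question rankings were not published, so the `example_rankings` below use made-up placeholder models and exist only to show the mechanics; `POINTS`, `tally`, and `REPORTED` are hypothetical names. The one thing that can be checked against the reported numbers is their sum: each question hands out 4 + 2 + 1 + 0 = 7 points, so 6 questions should total 42.

```python
# Points awarded per place, as listed in the README.
POINTS = {1: 4, 2: 2, 3: 1, 4: 0}

# Cumulative scores reported in the README.
REPORTED = {
    "CodeBooga-34B-v0.1": 22,
    "WizardCoder-Python-34B-V1.0": 12,
    "Phind-CodeLlama-34B-v2": 7,
    "CodeBooga-Reversed-34B-v0.1": 1,
}


def tally(rankings):
    """Sum points over questions; each ranking lists models from best to worst."""
    totals = {}
    for ranking in rankings:
        for place, model in enumerate(ranking, start=1):
            totals[model] = totals.get(model, 0) + POINTS[place]
    return totals


if __name__ == "__main__":
    # Made-up rankings for two questions, NOT the actual experiment data.
    example_rankings = [["A", "B", "C", "D"], ["B", "A", "D", "C"]]
    print(tally(example_rankings))  # {'A': 6, 'B': 6, 'C': 1, 'D': 1}

    # Sanity check on the reported totals: 6 questions x 7 points each.
    print(sum(REPORTED.values()) == 6 * sum(POINTS.values()))  # True
```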

## Quantized versions

### GGUF

TheBloke has kindly provided GGUF quantizations for llama.cpp:

https://huggingface.co/TheBloke/CodeBooga-34B-v0.1-GGUF

<a href="https://ko-fi.com/oobabooga"><img src="https://i.imgur.com/UJlEAYw.png"></a>