Initialize project; model provided by the ModelHub XC community

Model: oobabooga/CodeBooga-34B-v0.1 Source: Original Platform
2026-05-14 06:04:53 +08:00
commit 53f35705a4
18 changed files with 94076 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,94 @@
+---
+license: llama2
+---
+
+# CodeBooga-34B-v0.1
+
+This is a merge between the following two models:
+
+1) [Phind-CodeLlama-34B-v2](https://huggingface.co/Phind/Phind-CodeLlama-34B-v2)
+2) [WizardCoder-Python-34B-V1.0](https://huggingface.co/WizardLM/WizardCoder-Python-34B-V1.0)
+
+It was created with the [BlockMerge Gradient script](https://github.com/Gryphe/BlockMerge_Gradient), the same one that was used to create [MythoMax-L2-13b](https://huggingface.co/Gryphe/MythoMax-L2-13b), and with the same settings. The following YAML was used:
+
+```yaml
+model_path1: "Phind_Phind-CodeLlama-34B-v2_safetensors"
+model_path2: "WizardLM_WizardCoder-Python-34B-V1.0_safetensors"
+output_model_path: "CodeBooga-34B-v0.1"
+operations:
+  - operation: lm_head # Single tensor
+    filter: "lm_head"
+    gradient_values: [0.75]
+  - operation: embed_tokens # Single tensor
+    filter: "embed_tokens"
+    gradient_values: [0.75]
+  - operation: self_attn
+    filter: "self_attn"
+    gradient_values: [0.75, 0.25]
+  - operation: mlp
+    filter: "mlp"
+    gradient_values: [0.25, 0.75]
+  - operation: layernorm
+    filter: "layernorm"
+    gradient_values: [0.5, 0.5]
+  - operation: modelnorm # Single tensor
+    filter: "model.norm"
+    gradient_values: [0.75]
+```
+
+## Prompt format
+
+Both base models use the Alpaca format, so it should be used for this one as well.
+
+```
+Below is an instruction that describes a task. Write a response that appropriately completes the request.
+
+### Instruction:
+Your instruction
+
+### Response:
+Bot reply
+
+### Instruction:
+Another instruction
+
+### Response:
+Bot reply
+```
+
+## Evaluation
+
+(This is not very scientific, so bear with me.)
+
+I ran a quick experiment in which I asked a set of 3 Python and 3 JavaScript questions (difficult, real-world questions with nuance) to the following models:
+
+1) This one
+2) A second variant generated with `model_path1` and `model_path2` swapped in the YAML above, which I called CodeBooga-Reversed-34B-v0.1
+3) WizardCoder-Python-34B-V1.0
+4) Phind-CodeLlama-34B-v2
+
+Specifically, I used 4.250b EXL2 quantizations of each. I then ranked the responses to each question by quality and assigned the following scores:
+
+* 4th place: 0
+* 3rd place: 1
+* 2nd place: 2
+* 1st place: 4
+
+The resulting cumulative scores (out of a possible 24 per model, i.e. 6 questions at 4 points for first place) were:
+
+* CodeBooga-34B-v0.1: 22
+* WizardCoder-Python-34B-V1.0: 12
+* Phind-CodeLlama-34B-v2: 7
+* CodeBooga-Reversed-34B-v0.1: 1
+
+CodeBooga-34B-v0.1 performed very well, while its variant performed poorly, so I uploaded the former but not the latter.
+
+## Quantized versions
+
+### GGUF
+
+TheBloke has kindly provided GGUF quantizations for llama.cpp:
+
+https://huggingface.co/TheBloke/CodeBooga-34B-v0.1-GGUF
+
+<a href="https://ko-fi.com/oobabooga"><img src="https://i.imgur.com/UJlEAYw.png"></a>
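
Editor's sketch (not part of the commit): the "Prompt format" section of the README above describes an Alpaca-style template with alternating `### Instruction:` and `### Response:` blocks. The helper below is one way to assemble such a prompt for a multi-turn exchange. The function name `build_alpaca_prompt`, the `history` parameter, and the choice to end the prompt with a newline after the final `### Response:` header are assumptions made for illustration; the README only specifies the template text itself.

```python
from typing import Iterable, Tuple

# First line of the template, copied from the README's "Prompt format" section.
SYSTEM_LINE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
)


def build_alpaca_prompt(history: Iterable[Tuple[str, str]], instruction: str) -> str:
    """Assemble an Alpaca-style prompt (hypothetical helper, not from the README).

    history: earlier (instruction, response) pairs, oldest first.
    instruction: the new instruction the model should answer.
    """
    parts = [SYSTEM_LINE, ""]
    for past_instruction, past_response in history:
        # Each completed turn is followed by a blank line, as in the template.
        parts += [
            "### Instruction:", past_instruction, "",
            "### Response:", past_response, "",
        ]
    # Leave the last response open so the model generates the reply.
    # Ending with a newline after the header is an assumption.
    parts += ["### Instruction:", instruction, "", "### Response:", ""]
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_alpaca_prompt([], "Write a function that reverses a string."))
```

The returned string can be passed as the raw prompt to whichever backend is in use. A common companion setting, also an assumption here, is to stop generation when the model emits the next `### Instruction:` header.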