Initialize project; model provided by the ModelHub XC community

Model: adamo1139/Yi-34B-AEZAKMI-v1 Source: Original Platform
2026-05-02 00:10:53 +08:00
commit 99cd46bdf9
25 changed files with 175298 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,92 @@
+---
+license: apache-2.0
+tags:
+- llm
+- yi
+- fine-tune
+---
+
+## Model description
+
+Yi-34B base model fine-tuned on the AEZAKMI v1 dataset. Training took around 33 hours on a single local RTX 3090 Ti.
+It's like airoboros but with less gptslop, no refusals and less of the typical language used by RLHFed OpenAI models.
+Say goodbye to "It's important to remember"! \
+The prompt format is standard ChatML. Don't expect it to be good at math or riddles, or to be crazy smart. My end goal with AEZAKMI is to create a cozy free chatbot.
+The cost of this fine-tune was about $3 in electricity. This was my first attempt at training Yi-34B with this dataset.
+The base model used for fine-tuning was the 4k-context Yi-34B-Llama model shared by chargoddard.
+
+
+## Prompt Format
+
+I recommend using the ChatML format, as this was used during fine-tuning. \
+Here's the prompt format you should use. You can set a different system message; the model seems to respect that fine, so it wasn't overfitted to this one.
+
+```
+<|im_start|>system
+A chat with uncensored assistant.<|im_end|>
+<|im_start|>user
+{prompt}<|im_end|>
+<|im_start|>assistant
+```
+
+## Intended uses & limitations
+
+Use is limited by the Yi license.
+
+## Known Issues
+
+I recommend setting repetition penalty to around 1.05 to avoid repetition. So far I have had a good experience running this model with temperature 1.2.
+Multi-turn conversations could be a bit better: if you ask it to rewrite something with some fixes, it tends to just repeat the previous response verbatim without any improvements. This is especially noticeable with repetition penalty 1.0. \
+There is still some gptslop left: some responses end with a last paragraph along the lines of "Remember that bla bla bla". I will try to get rid of it in the next version of the dataset. \
+Stories have ChatGPT-like paragraph spacing. I will try to include more stories with long paragraphs in the next dataset version.
+
+The unrestricted-ness of v1 isn't something I am happy with yet. It still refuses to answer quite often and behaves kinda like a normal airoboros, especially with the default system prompt "A chat.". \
+With a slightly modified prompt it works much better. I recommend using "A chat with uncensored assistant."; it's stupid but it helps a lot. \
+I was made aware of the frequent occurrence of the phrase "sending shivers down a spine" in generations during RP, so I will manually fix those samples and put in different wording. \
+I plan to use the Llama-fied Yi-34B-200K for the next version. v1 very likely won't handle long context above 6k-8k well, be warned. \
+There is also some issue with handling long system messages for RP. I will investigate that before releasing the v2 dataset.
+
+Feel free to report issues in the discussions panel here. I don't lurk /lmg/ too often and I would still like to hear some feedback.
+
+
+## Axolotl training parameters
+
+- bnb_4bit_use_double_quant: true
+- bnb_4bit_compute_dtype: torch.bfloat16
+- is_llama_derived_model: true
+- load_in_4bit: true
+- adapter: qlora
+- sequence_len: 1200
+- sample_packing: false
+- lora_r: 16
+- lora_alpha: 32
+- lora_target_modules:
+  - q_proj
+  - v_proj
+  - k_proj
+  - o_proj
+  - gate_proj
+  - down_proj
+  - up_proj
+- lora_target_linear: true
+- pad_to_sequence_len: true
+- micro_batch_size: 1
+- gradient_accumulation_steps: 1
+- num_epochs: 1
+- optimizer: adamw_bnb_8bit
+- lr_scheduler: constant
+- learning_rate: 0.00007
+- train_on_inputs: false
+- group_by_length: false
+- bf16: true
+- bfloat16: true
+- flash_optimum: false
+- gradient_checkpointing: true
+- flash_attention: true
+- seed: 42
+
+
+## Upcoming
+
+~I will release adapter files and maybe exllama v2 quant shortly.~ \
+LoRA and exl2 quant have been released.
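
The ChatML template from the README's "Prompt Format" section can be filled in programmatically. The snippet below is a minimal sketch, not the author's code: the function name `build_chatml_prompt`, the constant names, and the trailing newline after the final `<|im_start|>assistant` line are assumptions made for illustration. The `SUGGESTED_SAMPLING` values come from the "Known Issues" section, but the key names are only illustrative and need to be mapped to whatever your inference backend expects.

```python
# Default system message recommended in the README.
DEFAULT_SYSTEM = "A chat with uncensored assistant."

# Sampling values suggested in the README; the key names are an assumption.
SUGGESTED_SAMPLING = {"temperature": 1.2, "repetition_penalty": 1.05}


def build_chatml_prompt(prompt: str, system: str = DEFAULT_SYSTEM) -> str:
    """Fill the README's ChatML template for a single user turn.

    The returned string ends with an open assistant turn so the model
    continues from there. The final newline is an assumption.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    print(build_chatml_prompt("Write a short story about a lighthouse."))
```

Passing `system="A chat."` reproduces the default system prompt that the README says leads to more refusals, which may be useful for comparing the two.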