Initialize project; model provided by the ModelHub XC community

Model: flammenai/FlameDesigner-Qwen2.5-3B-v1 Source: Original Platform
2026-06-16 03:42:16 +08:00
commit 14c704d4a8
8 changed files with 255 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,46 @@
+---
+library_name: transformers
+pipeline_tag: text-generation
+tags:
+- merlina
+- grimoire
+- text-generation
+- sft
+datasets:
+- flammenai/flame-kindling-v1
+base_model:
+- Qwen/Qwen2.5-3B-Instruct
+---
+
+# FlameDesigner-Qwen2.5-3B-v1
+
+## Training Configuration
+
+| Parameter | Value |
+|-----------|-------|
+| Training Mode | SFT |
+| Base Model | `Qwen/Qwen2.5-3B-Instruct` |
+| Learning Rate | 0.0001 |
+| Epochs | 2 |
+| Batch Size | 1 |
+| Gradient Accumulation | 8 |
+| Effective Batch Size | 8 |
+| Max Sequence Length | 2048 |
+| Optimizer | paged_adamw_8bit |
+| LR Scheduler | cosine |
+| Warmup Ratio | 0.05 |
+| Weight Decay | 0.01 |
+| Max Grad Norm | 0.5 |
+| Seed | 42 |
+| LoRA Rank (r) | 128 |
+| LoRA Alpha | 128 |
+| LoRA Dropout | 0.05 |
+| Target Modules | up_proj, down_proj, gate_proj, k_proj, q_proj, v_proj, o_proj |
+| Quantization | 4-bit (NF4) |
+| GPU | NVIDIA RTX A6000 |
+
+---
+
+![Trained with Merlina](https://raw.githubusercontent.com/Schneewolf-Labs/Merlina/refs/heads/main/frontend/madewithmerlina_smol.png)
+
+[Merlina on GitHub](https://github.com/Schneewolf-Labs/Merlina)
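
The following is a minimal sketch, not taken from the commit, of how a few rows in the Training Configuration table relate to each other. It checks the effective batch size (batch size × gradient accumulation), computes the LoRA scaling factor as alpha / r under the standard LoRA formulation, and traces a cosine learning-rate schedule with linear warmup. `NUM_GPUS` and `TOTAL_STEPS` are assumptions made for illustration; the model card states neither, and the schedule shape (linear warmup from zero, cosine decay to zero) is a common convention that Merlina may or may not follow exactly.

```python
import math

# Values copied from the Training Configuration table.
LEARNING_RATE = 1e-4
BATCH_SIZE = 1
GRAD_ACCUM = 8
WARMUP_RATIO = 0.05
LORA_R = 128
LORA_ALPHA = 128

# Assumptions (NOT in the model card): a single GPU and a made-up step count.
NUM_GPUS = 1
TOTAL_STEPS = 1000


def effective_batch_size(per_device, grad_accum, num_gpus):
    """Number of samples that contribute to one optimizer step."""
    return per_device * grad_accum * num_gpus


def lora_scaling(alpha, r):
    """Factor applied to the low-rank update in standard LoRA (alpha / r)."""
    return alpha / r


def lr_at(step, total_steps, peak_lr, warmup_ratio):
    """Learning rate at `step`: linear warmup, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(BATCH_SIZE, GRAD_ACCUM, NUM_GPUS))
    print("LoRA scaling:", lora_scaling(LORA_ALPHA, LORA_R))
    for step in (0, 25, 50, 500, 1000):
        print(step, lr_at(step, TOTAL_STEPS, LEARNING_RATE, WARMUP_RATIO))
```

With rank and alpha both set to 128, the scaling factor is 1, so the adapter update is applied unscaled. The effective batch size of 8 in the table matches 1 × 8 only if training ran on one device, which is consistent with the single GPU listed but is not stated outright.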