---
base_model: Qwen/Qwen2.5-7B-Instruct
datasets:
- u-10bei/dbbench_sft_dataset_react
- u-10bei/dbbench_sft_dataset_react_v2
- u-10bei/dbbench_sft_dataset_react_v3
- u-10bei/dbbench_sft_dataset_react_v4
language:
- en
license: apache-2.0
pipeline_tag: text-generation
tags:
- sft
- agent
- tool-use
- dbbench
- text-to-sql
---

# Qwen2.5-7B DB Bench Combined SFT (v1-v4)

This repository provides a **merged full-weight model** fine-tuned from
**Qwen2.5-7B-Instruct** using **LoRA + Unsloth**, then merged to 16bit.

## Training Objective

This model is trained to improve **DB Bench (database operation) performance**
on the AgentBench evaluation benchmark.

ALFWorld performance relies entirely on the base model's inherent capability
(no ALFWorld training data used).

Loss is applied to **all assistant turns** in the multi-turn trajectory,
enabling the model to learn SQL generation, action selection, and error recovery.

## Training Data

- DB Bench v1 (u-10bei/dbbench_sft_dataset_react): ~750 samples
- DB Bench v2 (u-10bei/dbbench_sft_dataset_react_v2): ~750 samples
- DB Bench v3 (u-10bei/dbbench_sft_dataset_react_v3): ~750 samples
- DB Bench v4 (u-10bei/dbbench_sft_dataset_react_v4): ~750 samples
- Total: ~3,000 samples
- **ALFWorld data intentionally excluded** to preserve base model performance

## Training Configuration

- Base model: Qwen/Qwen2.5-7B-Instruct
- Method: LoRA → merged to 16bit
- Max sequence length: 2048
- Epochs: 2
- Learning rate: 2e-6
- LoRA: r=64, alpha=128
- Batch size: 2, Gradient accumulation: 4 (effective batch 8)
- Optimizer: AdamW (cosine scheduler)
- Framework: Unsloth

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "koguma-ai/dbbench-combined-baseline0301"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```

## Sources & Terms

Training data: u-10bei/dbbench_sft_dataset_react (v1-v4)

Dataset License: Apache-2.0.
Users must comply with the Apache-2.0 license and the base model's original terms of use.

## Limitations

- Optimized for DB Bench tasks only
- ALFWorld performance relies on base model capability
- Weak categories: aggregation-MAX (16.7%), INSERT (33.3%)
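
## Illustrative Sketch: Loss on All Assistant Turns

The Training Objective section states that loss is applied to all assistant turns of a multi-turn trajectory. The snippet below is a minimal sketch of what that kind of label masking can look like; it is not the actual training code used for this model. The function name `mask_non_assistant`, the `(role, token_ids)` input layout, and the token ids are all hypothetical and chosen only for illustration. The one fixed convention it relies on is that a label of `-100` is ignored by PyTorch's cross-entropy loss by default.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch cross-entropy by default


def mask_non_assistant(turns):
    """Build (input_ids, labels) from already-tokenized turns.

    `turns` is a list of (role, token_ids) pairs in conversation order
    (a hypothetical layout, assumed for this sketch). Labels copy the
    token ids on assistant turns and are IGNORE_INDEX everywhere else,
    so only assistant tokens contribute to the loss.
    """
    input_ids, labels = [], []
    for role, token_ids in turns:
        input_ids.extend(token_ids)
        if role == "assistant":
            labels.extend(token_ids)
        else:
            labels.extend([IGNORE_INDEX] * len(token_ids))
    return input_ids, labels


# Made-up token ids standing in for a short DB Bench style trajectory.
example_turns = [
    ("system", [1, 2]),
    ("user", [3, 4, 5]),
    ("assistant", [6, 7]),        # e.g. a first SQL action
    ("user", [8]),                # e.g. an error returned by the database
    ("assistant", [9, 10, 11]),   # e.g. a corrected query
]

input_ids, labels = mask_non_assistant(example_turns)
```

Because both assistant turns keep their labels, the later turn that follows an error message is trained on as well, which is how this style of masking can expose the model to error recovery and not only to the first action.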