Files
ModelHub XC f595d64212 初始化项目,由ModelHub XC社区提供模型
Model: plotMaker/qwen25-7b-sft-merged-v5v6-a50
Source: Original Platform
2026-05-19 13:57:23 +08:00

3.0 KiB

base_model, datasets, language, license, library_name, pipeline_tag, tags
base_model datasets language license library_name pipeline_tag tags
Qwen/Qwen2.5-7B-Instruct
u-10bei/sft_alfworld_trajectory_dataset_v2
u-10bei/sft_alfworld_trajectory_dataset_v3
u-10bei/sft_alfworld_trajectory_dataset_v4
u-10bei/sft_alfworld_trajectory_dataset_v5
u-10bei/dbbench_sft_dataset_react
u-10bei/dbbench_sft_dataset_react_v2
u-10bei/dbbench_sft_dataset_react_v3
u-10bei/dbbench_sft_dataset_react_v4
en
apache-2.0 transformers text-generation
sft
agent
tool-use
alfworld
dbbench

qwen25-7b-sft-merged-v5v6-a50

This repository provides a fully merged model fine-tuned from Qwen2.5-7B-Instruct using QLoRA + Unsloth.

Two SFT models (v5 and v6) were trained independently, then combined via weight interpolation (alpha=0.5). This is a complete model — no adapters or additional weights are needed.

Training Objective

This model is trained to improve multi-turn agent task performance on ALFWorld (household tasks) and DBBench (database operations).

Loss is applied to all assistant turns in the multi-turn trajectory, enabling the model to learn environment observation, action selection, tool use, and recovery from errors.

Training Configuration

  • Base model: Qwen/Qwen2.5-7B-Instruct
  • Method: QLoRA (4-bit) + Unsloth, merged into base model
  • Max sequence length: 2048
  • Epochs: 2
  • Learning rate: 5e-5
  • LoRA: r=32, alpha=64
  • Post-training: weight interpolation of v5 and v6 (alpha=0.5)

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "plotMaker/qwen25-7b-sft-merged-v5v6-a50"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

References

Sources & Terms (IMPORTANT)

Training data:

  • u-10bei/sft_alfworld_trajectory_dataset_v2 ~ v5
  • u-10bei/dbbench_sft_dataset_react ~ v4

Base model: Qwen/Qwen2.5-7B-Instruct

This repository does NOT redistribute the dataset. Users must comply with the dataset license and base model terms.