qwen3-4b-structured-output-merged-stage-a

cyumizou/qwen3-4b-structured-output-merged-stage-a

Go to file

ModelHub XC 88caeb3107 初始化项目，由ModelHub XC社区提供模型

Model: cyumizou/qwen3-4b-structured-output-merged-stage-a
Source: Original Platform

2026-05-26 13:11:19 +08:00

.gitattributes

初始化项目，由ModelHub XC社区提供模型

2026-05-26 13:11:19 +08:00

added_tokens.json

初始化项目，由ModelHub XC社区提供模型

2026-05-26 13:11:19 +08:00

chat_template.jinja

初始化项目，由ModelHub XC社区提供模型

2026-05-26 13:11:19 +08:00

config.json

初始化项目，由ModelHub XC社区提供模型

2026-05-26 13:11:19 +08:00

generation_config.json

初始化项目，由ModelHub XC社区提供模型

2026-05-26 13:11:19 +08:00

merges.txt

初始化项目，由ModelHub XC社区提供模型

2026-05-26 13:11:19 +08:00

model-00001-of-00002.safetensors

初始化项目，由ModelHub XC社区提供模型

2026-05-26 13:11:19 +08:00

model-00002-of-00002.safetensors

初始化项目，由ModelHub XC社区提供模型

2026-05-26 13:11:19 +08:00

model.safetensors.index.json

初始化项目，由ModelHub XC社区提供模型

2026-05-26 13:11:19 +08:00

README.md

初始化项目，由ModelHub XC社区提供模型

2026-05-26 13:11:19 +08:00

special_tokens_map.json

初始化项目，由ModelHub XC社区提供模型

2026-05-26 13:11:19 +08:00

tokenizer_config.json

初始化项目，由ModelHub XC社区提供模型

2026-05-26 13:11:19 +08:00

tokenizer.json

初始化项目，由ModelHub XC社区提供模型

2026-05-26 13:11:19 +08:00

vocab.json

初始化项目，由ModelHub XC社区提供模型

2026-05-26 13:11:19 +08:00

README.md

base_model, datasets, language, license, pipeline_tag, tags

base_model

datasets

language

license

pipeline_tag

What this model is for (StageA)

This model corresponds to StageA in a two-stage training procedure.

StageA goal: stabilize output mode for structured generation:

reduce non-structured preambles (e.g., "Here/Sure")
reduce code-fences (json / xml / ```yaml)
output only the required structured format reliably

This merged model is intended to be used as a stable starting point for StageB (TOML failure-pattern mitigation) without drifting back to chatty preambles.

Training Objective

Improve structured output reliability (JSON / YAML / XML / TOML / CSV), especially eliminating non-structured preambles that break parsers.

Training Configuration (StageA)

Base model: Qwen/Qwen3-4B-Instruct-2507
Method: QLoRA (4-bit) with LoRA adapter, then merged into base weights
Max sequence length: 1024
Training length: 1 epoch(s) (or step-limited, if configured)
Learning rate: 2e-05
LoRA: r=8, alpha=16

Note: In StageA, loss is applied to the full assistant output to suppress preambles (if you used full-loss). If you used output-only loss, replace this sentence accordingly.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "your_id/your-repo"  # this repo

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # or float16 depending on your environment
    device_map="auto",
)

Compliance / Notes

This model is derived only from the organizer-approved base model (Qwen/Qwen3-4B-Instruct-2507) and uses no architecture changes.

The merge operation is used only to integrate post-training results (SFT/LoRA) under the same architecture.

Sources & Terms (IMPORTANT)

Training data: u-10bei/structured_data_with_cot_dataset_512_v2

Dataset License: MIT License. This dataset is used and distributed under the terms of the MIT License. Compliance: Users must comply with the MIT license (including copyright notice) and the base model's original terms of use.