Files
qwen3-4b-structured-output-…/README.md
ModelHub XC 88caeb3107 初始化项目,由ModelHub XC社区提供模型
Model: cyumizou/qwen3-4b-structured-output-merged-stage-a
Source: Original Platform
2026-05-26 13:11:19 +08:00

83 lines
2.7 KiB
Markdown

---
base_model: Qwen/Qwen3-4B-Instruct-2507
datasets:
- u-10bei/structured_data_with_cot_dataset_512_v2
language:
- en
license: apache-2.0
pipeline_tag: text-generation
tags:
- structured-output
- merged-weights
- sft
- qlora
---
qwen3-4b-structured-output-merged-stage-a
This repository provides a **merged (fully materialized) model** derived from
**Qwen/Qwen3-4B-Instruct-2507**. The weights were obtained by training a **LoRA adapter**
and then **merging the adapter into the base model weights** (merge-and-unload).
**You can load this model directly with `AutoModelForCausalLM.from_pretrained()`**
❌ This is **NOT** an adapter-only repository.
## What this model is for (StageA)
This model corresponds to **StageA** in a two-stage training procedure.
**StageA goal:** stabilize *output mode* for structured generation:
- reduce non-structured preambles (e.g., "Here/Sure")
- reduce code-fences (```json / ```xml / ```yaml)
- output only the required structured format reliably
This merged model is intended to be used as a stable starting point for StageB
(TOML failure-pattern mitigation) without drifting back to chatty preambles.
## Training Objective
Improve **structured output reliability** (JSON / YAML / XML / TOML / CSV),
especially eliminating non-structured preambles that break parsers.
## Training Configuration (StageA)
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Method: QLoRA (4-bit) with LoRA adapter, then merged into base weights
- Max sequence length: 1024
- Training length: 1 epoch(s) (or step-limited, if configured)
- Learning rate: 2e-05
- LoRA: r=8, alpha=16
Note: In StageA, loss is applied to the full assistant output to suppress preambles
(if you used full-loss). If you used output-only loss, replace this sentence accordingly.
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model_id = "your_id/your-repo" # this repo
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype=torch.bfloat16, # or float16 depending on your environment
device_map="auto",
)
```
## Compliance / Notes
This model is derived only from the organizer-approved base model
(Qwen/Qwen3-4B-Instruct-2507) and uses no architecture changes.
The merge operation is used only to integrate post-training results
(SFT/LoRA) under the same architecture.
## Sources & Terms (IMPORTANT)
Training data: u-10bei/structured_data_with_cot_dataset_512_v2
Dataset License: MIT License. This dataset is used and distributed under the terms of the MIT License.
Compliance: Users must comply with the MIT license (including copyright notice) and the base model's original terms of use.