Initialize project; model provided by the ModelHub XC community
Model: cyumizou/qwen3-4b-structured-output-merged-stage-a Source: Original Platform
---
base_model: Qwen/Qwen3-4B-Instruct-2507
datasets:
- u-10bei/structured_data_with_cot_dataset_512_v2
language:
- en
license: apache-2.0
pipeline_tag: text-generation
tags:
- structured-output
- merged-weights
- sft
- qlora
---

# qwen3-4b-structured-output-merged-stage-a

This repository provides a **merged (fully materialized) model** derived from
**Qwen/Qwen3-4B-Instruct-2507**. The weights were obtained by training a **LoRA adapter**
and then **merging the adapter into the base model weights** (merge-and-unload).

✅ **You can load this model directly with `AutoModelForCausalLM.from_pretrained()`**
❌ This is **NOT** an adapter-only repository.
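
To make the merge step concrete, here is a minimal pure-Python sketch of what folding a LoRA update into a dense weight does. It assumes the standard LoRA formulation, `W_merged = W + (alpha / r) * (B @ A)`, and uses the `r=8`, `alpha=16` values from this card's training configuration only to compute the scale. The helper names `matmul` and `merge_lora` and the toy matrices are illustrative and are not taken from the training code.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def merge_lora(weight, lora_a, lora_b, r, alpha):
    """Return weight + (alpha / r) * (lora_b @ lora_a) as a new matrix."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # shape: (out_features, in_features)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy example: 2x3 base weight. The factors have rank 1 to keep the numbers
# readable; r and alpha are passed separately and only determine the scale
# (16 / 8 = 2.0 with this card's values).
base = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
lora_b = [[1.0], [2.0]]      # (out_features x rank)
lora_a = [[0.5, 0.0, 1.0]]   # (rank x in_features)

merged = merge_lora(base, lora_a=lora_a, lora_b=lora_b, r=8, alpha=16)
print(merged)  # [[2.0, 0.0, 2.0], [2.0, 1.0, 4.0]]
```

After the merge, the result is an ordinary dense matrix with the same shape as the base weight, which is why a merged checkpoint loads without any adapter files.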

## What this model is for (StageA)

This model corresponds to **StageA** in a two-stage training procedure.

**StageA goal:** stabilize the *output mode* for structured generation:
- reduce non-structured preambles (e.g., "Here ..." / "Sure ...")
- reduce code fences (`` ```json ``, `` ```xml ``, `` ```yaml ``)
- reliably output only the required structured format

This merged model is intended to be used as a stable starting point for StageB
(TOML failure-pattern mitigation) without drifting back to chatty preambles.
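
One way to check the two failure modes named above on a model response is a small heuristic like the following. This is a sketch, not the evaluation used for this model: the function name `output_mode_violations` and the list of chatty openers are assumptions, and the preamble check can misfire on legitimate output that happens to start with one of those words (for example a YAML key named `here`).

```python
import re

FENCE = "`" * 3  # a Markdown code fence, built here to avoid nesting fences

# Hypothetical list of chatty openers; extend it for your own prompts.
PREAMBLE_PATTERN = re.compile(r"^\s*(here|sure|certainly|okay|of course)\b", re.IGNORECASE)
# A fence counts only when it starts a line (optionally after whitespace).
FENCE_PATTERN = re.compile(r"^\s*" + FENCE, re.MULTILINE)


def output_mode_violations(text):
    """Return the output-mode problems found in a model response."""
    violations = []
    if PREAMBLE_PATTERN.search(text):
        violations.append("preamble")
    if FENCE_PATTERN.search(text):
        violations.append("code_fence")
    return violations


clean = '{"name": "a"}'
chatty = "Sure! Here is the JSON:\n" + FENCE + 'json\n{"name": "a"}\n' + FENCE

print(output_mode_violations(clean))   # []
print(output_mode_violations(chatty))  # ['preamble', 'code_fence']
```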

## Training Objective

Improve **structured output reliability** (JSON / YAML / XML / TOML / CSV),
in particular by eliminating non-structured preambles that break parsers.
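
The sketch below illustrates why a preamble breaks parsers, using only the Python standard library. It covers JSON, XML, and TOML; YAML and CSV are left out because the standard library has no YAML parser and its CSV reader accepts almost any text. The helper name `parses_as` is hypothetical, and `tomllib` is only available from Python 3.11, so TOML is treated as unsupported on older interpreters.

```python
import json
import xml.etree.ElementTree as ET

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    tomllib = None  # TOML is reported as unsupported on older interpreters


def parses_as(text, fmt):
    """Return True if `text` parses as `fmt` ("json", "xml", or "toml")."""
    if fmt == "json":
        parser, errors = json.loads, (json.JSONDecodeError,)
    elif fmt == "xml":
        parser, errors = ET.fromstring, (ET.ParseError,)
    elif fmt == "toml" and tomllib is not None:
        parser, errors = tomllib.loads, (tomllib.TOMLDecodeError,)
    else:
        raise ValueError(f"unsupported format: {fmt!r}")
    try:
        parser(text)
    except errors:
        return False
    return True


print(parses_as('{"a": 1}', "json"))                  # True
print(parses_as('Sure! {"a": 1}', "json"))            # False
print(parses_as("Here is the XML: <root/>", "xml"))   # False
```

A strict parser rejects the whole response as soon as any text precedes the payload, so a single chatty opener is enough to fail a downstream pipeline.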

## Training Configuration (StageA)

- Base model: Qwen/Qwen3-4B-Instruct-2507
- Method: QLoRA (4-bit) with LoRA adapter, then merged into base weights
- Max sequence length: 1024
- Training length: 1 epoch (or step-limited, if configured)
- Learning rate: 2e-05
- LoRA: r=8, alpha=16

Note: In StageA, loss is applied to the full assistant output to suppress preambles
(if you used full-loss). If you used output-only loss, replace this sentence accordingly.
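
The difference between the two loss settings mentioned in the note can be sketched as a label mask. This is an illustration under assumptions, not the training code: it assumes labels use `-100` for ignored positions (the default `ignore_index` of `torch.nn.CrossEntropyLoss`), and it reads "output-only" as masking the assistant tokens that come before the final structured output. The function `build_labels` and the toy token IDs are hypothetical.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_labels(input_ids, prompt_len, output_start=None):
    """Copy input_ids into labels, masking positions excluded from the loss.

    prompt_len:   number of leading prompt tokens (always masked).
    output_start: if given, assistant tokens before this index are masked too
                  ("output-only" loss); if None, every assistant token is kept
                  ("full assistant output" loss).
    """
    first_kept = prompt_len if output_start is None else max(prompt_len, output_start)
    return [
        tok if i >= first_kept else IGNORE_INDEX
        for i, tok in enumerate(input_ids)
    ]


# Toy IDs: 3 prompt tokens, 2 leading assistant tokens, 3 structured-output tokens.
ids = [11, 12, 13, 21, 22, 31, 32, 33]

full_loss = build_labels(ids, prompt_len=3)
output_only = build_labels(ids, prompt_len=3, output_start=5)

print(full_loss)    # [-100, -100, -100, 21, 22, 31, 32, 33]
print(output_only)  # [-100, -100, -100, -100, -100, 31, 32, 33]
```

With the full-output mask, the first assistant tokens contribute to the loss, so the model is trained on how a response begins; with the output-only mask, those leading positions are not supervised.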

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "your_id/your-repo"  # placeholder: replace with this repo's ID

# The weights are already merged, so no PEFT adapter loading is needed.
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # or float16 depending on your environment
    device_map="auto",  # place layers on the available devices automatically
)
```

## Compliance / Notes

This model is derived only from the organizer-approved base model
(Qwen/Qwen3-4B-Instruct-2507) and uses no architecture changes.

The merge operation is used only to integrate post-training results
(SFT/LoRA) under the same architecture.

## Sources & Terms (IMPORTANT)

Training data: u-10bei/structured_data_with_cot_dataset_512_v2

Dataset License: MIT License. The dataset is used and distributed under the terms of that license.

Compliance: Users must comply with the MIT License (including its copyright notice) and with the base model's original terms of use.