---
license: apache-2.0
tags:
- unsloth
- trl
- sft
- instruction-following
- reasoning
datasets:
- patrickfleith/instruction-freak-reasoning
language:
- en
base_model: Qwen/Qwen3-0.6B
pipeline_tag: text-generation
library_name: transformers
---

# Qwen3-0.6B-IF-Expert

This project performs full fine-tuning of the Qwen3-0.6B language model to enhance its instruction-following and reasoning capabilities. Training was conducted on the `patrickfleith/instruction-freak-reasoning` dataset in bfloat16 (bf16) precision for efficient optimization.
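For reference, a minimal inference sketch using the `transformers` generation API. This is an illustration, not the authors' published usage snippet: the generation settings are assumptions, and running the guarded section downloads the model weights.

```python
def build_messages(instruction: str) -> list[dict]:
    """Wrap a user instruction in the chat format expected by apply_chat_template."""
    return [{"role": "user", "content": instruction}]


if __name__ == "__main__":
    # Heavy imports are kept inside the guard so the helper above stays
    # importable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = "suayptalha/Qwen3-0.6B-IF-Expert"
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype="bfloat16")

    prompt = tokenizer.apply_chat_template(
        build_messages("Explain step by step why the sky is blue."),
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=512)
    # Decode only the newly generated tokens, skipping the echoed prompt.
    print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:],
                           skip_special_tokens=True))
```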

## Training Procedure

1. **Dataset Preparation**
   - The `patrickfleith/instruction-freak-reasoning` dataset was used.
   - Each example pairs a complex instruction with an in-depth, reasoning-based response.
   - Prompts were structured to encourage chain-of-thought-style outputs where applicable.
2. **Model Loading and Configuration**
   - The Qwen3-0.6B base model weights were loaded via the `unsloth` library in bf16 precision.
   - All model layers were updated (`full_finetuning=True`) to fully adapt the model to instruction understanding and stepwise response generation.
3. **Supervised Fine-Tuning**
   - Fine-tuning was conducted with the Hugging Face TRL library using the supervised fine-tuning (SFT) approach.
   - The model was trained to follow detailed instructions, reason logically, and generate structured responses.
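The three steps above can be sketched roughly as follows. This is a hedged reconstruction, not the actual training script: the dataset column names (`instruction`, `response`) and all hyperparameters are assumptions, and newer TRL/unsloth versions may differ in argument names.

```python
def to_chat(example: dict) -> dict:
    """Map a dataset row to the `messages` format TRL's SFTTrainer consumes.

    The field names `instruction` and `response` are assumptions; check the
    dataset's actual column names before running.
    """
    return {
        "messages": [
            {"role": "user", "content": example["instruction"]},
            {"role": "assistant", "content": example["response"]},
        ]
    }


if __name__ == "__main__":
    # Requires a CUDA GPU plus the unsloth, trl, and datasets packages.
    from datasets import load_dataset
    from trl import SFTConfig, SFTTrainer
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="Qwen/Qwen3-0.6B",
        dtype=None,            # auto-selects bf16 on supported hardware
        full_finetuning=True,  # update all layers, not just LoRA adapters
    )

    dataset = load_dataset("patrickfleith/instruction-freak-reasoning", split="train")
    dataset = dataset.map(to_chat, remove_columns=dataset.column_names)

    trainer = SFTTrainer(
        model=model,
        processing_class=tokenizer,
        train_dataset=dataset,
        args=SFTConfig(
            output_dir="qwen3-0.6b-if-expert",
            per_device_train_batch_size=2,  # illustrative hyperparameters,
            gradient_accumulation_steps=4,  # not the values actually used
            num_train_epochs=1,
            learning_rate=2e-5,
            bf16=True,
            logging_steps=10,
        ),
    )
    trainer.train()
```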

## Purpose and Outcome

- The model's ability to follow complex instructions and explain its reasoning process is significantly improved over the base model.
- It generates both coherent reasoning steps and conclusive answers, improving transparency and usability for instruction-based tasks.

## License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

## Support

Buy Me A Coffee

## Description

Model synced from source: suayptalha/Qwen3-0.6B-IF-Expert