d13f2dc5be9ff26c7e5f2d2625d9584e56cda6fc
Model: JunHowie/Qwen3-8B-Instruct Source: Original Platform
library_name, license, license_link, pipeline_tag, base_model
| library_name | license | license_link | pipeline_tag | base_model | |
|---|---|---|---|---|---|
| transformers | apache-2.0 | https://huggingface.co/Qwen/Qwen3-8B/blob/main/LICENSE | text-generation |
|
Qwen3-8B-Instruct
NOTE:This model is the Instruct-aligned variant, and it will not generate <think></think> blocks in its outputs.
Additionally, there is no need to specify enable_thinking=False anymore.
This model was trained using ms-swift as the post-training framework, with full-parameter SFT on 4 × 80GB GPUs.
The dataset used is the Chinese Distillation Dataset based on Qwen3-235B-2507
available at:
https://www.modelscope.cn/datasets/swift/Chinese-Qwen3-235B-2507-Distill-data-110k-SFT
【vLLM Startup Command】
vllm serve JunHowie/Qwen3-8B-Instruct
【Dependencies】
vllm>=0.10.2
transformers>=4.56.1
【Model Download】
from modelscope import snapshot_download
snapshot_download('JunHowie/Qwen3-8B-Instruct', cache_dir="your_local_path")
Description
Languages
Jinja
100%