Files
Qwen3-8B-Instruct/README.md

40 lines
1009 B
Markdown
Raw Normal View History

---
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3-8B/blob/main/LICENSE
pipeline_tag: text-generation
base_model:
- Qwen/Qwen3-8B
---
# Qwen3-8B-Instruct
**NOTEThis model is the Instruct-aligned variant, and it will not generate ``<think></think>`` blocks in its outputs.
Additionally, there is no need to specify enable_thinking=False anymore.**
This model was trained using ms-swift as the post-training framework, with full-parameter SFT on 4 × 80GB GPUs.<br>
The dataset used is the Chinese Distillation Dataset based on Qwen3-235B-2507
<br>available at:
https://www.modelscope.cn/datasets/swift/Chinese-Qwen3-235B-2507-Distill-data-110k-SFT
### 【vLLM Startup Command】
```
vllm serve JunHowie/Qwen3-8B-Instruct
```
### 【Dependencies】
```
vllm>=0.10.2
transformers>=4.56.1
```
### 【Model Download】
```python
from modelscope import snapshot_download
snapshot_download('JunHowie/Qwen3-8B-Instruct', cache_dir="your_local_path")
```