初始化项目,由ModelHub XC社区提供模型
Model: JunHowie/Qwen3-8B-Instruct Source: Original Platform
This commit is contained in:
39
README.md
Normal file
39
README.md
Normal file
@@ -0,0 +1,39 @@
|
||||
---
|
||||
library_name: transformers
|
||||
license: apache-2.0
|
||||
license_link: https://huggingface.co/Qwen/Qwen3-8B/blob/main/LICENSE
|
||||
pipeline_tag: text-generation
|
||||
base_model:
|
||||
- Qwen/Qwen3-8B
|
||||
---
|
||||
# Qwen3-8B-Instruct
|
||||
|
||||
**NOTE:This model is the Instruct-aligned variant, and it will not generate ``<think></think>`` blocks in its outputs.
|
||||
Additionally, there is no need to specify enable_thinking=False anymore.**
|
||||
|
||||
|
||||
|
||||
|
||||
This model was trained using ms-swift as the post-training framework, with full-parameter SFT on 4 × 80GB GPUs.<br>
|
||||
The dataset used is the Chinese Distillation Dataset based on Qwen3-235B-2507
|
||||
<br>available at:
|
||||
https://www.modelscope.cn/datasets/swift/Chinese-Qwen3-235B-2507-Distill-data-110k-SFT
|
||||
|
||||
|
||||
### 【vLLM Startup Command】
|
||||
```
|
||||
vllm serve JunHowie/Qwen3-8B-Instruct
|
||||
```
|
||||
|
||||
### 【Dependencies】
|
||||
```
|
||||
vllm>=0.10.2
|
||||
transformers>=4.56.1
|
||||
```
|
||||
|
||||
### 【Model Download】
|
||||
|
||||
```python
|
||||
from modelscope import snapshot_download
|
||||
snapshot_download('JunHowie/Qwen3-8B-Instruct', cache_dir="your_local_path")
|
||||
```
|
||||
Reference in New Issue
Block a user