### What this PR does / why we need it?
Add a new single-NPU quantization tutorial that uses the latest Qwen3
model.
- vLLM version: v0.10.0
- vLLM main: 8e8e0b6af1
Signed-off-by: 22dimensions <waitingwind@foxmail.com>
19 lines
283 B
Markdown
# Tutorials

:::{toctree}
:caption: Deployment
:maxdepth: 1
single_npu
single_npu_multimodal
single_npu_audio
single_npu_qwen3_embedding
single_npu_qwen3_quantization
multi_npu
multi_npu_moge
multi_npu_qwen3_moe
multi_npu_quantization
single_node_300i
multi_node
multi_node_kimi
:::