[Doc] Add qwen3 embedding 8b guide (#1734)
1. Add the tutorials for qwen3-embedding-8b
2. Remove VLLM_USE_V1=1 from the docs; it is no longer needed as of 0.9.2
- vLLM version: v0.9.2
- vLLM main: 5923ab9524
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
@@ -1,6 +1,6 @@
# Multi-NPU (QwQ 32B W8A8)
-## Run docker container:
+## Run docker container
:::{note}
w8a8 quantization feature is supported by v0.8.4rc2 or higher
:::