3 Commits

Author SHA1 Message Date
ming1212
9268ad11e3 Qwen3-Next: Update the gpu-memory-utilization parameter to 0.7 (#5129)
### What this PR does / why we need it?
Update the gpu-memory-utilization parameter to 0.7
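
For context, the setting can be illustrated with a minimal sketch that assembles a `vllm serve` command line carrying `--gpu-memory-utilization 0.7`. The helper name `build_serve_command` and the model identifier are assumptions made for illustration; this commit message does not state the exact command or model path used in the tutorial.

```python
import shlex


def build_serve_command(model: str, gpu_memory_utilization: float = 0.7) -> list[str]:
    """Return an argv list for `vllm serve` with the given memory fraction.

    Hypothetical helper for illustration only; it is not part of vLLM.
    """
    # The value is a fraction of device memory, so it must lie in (0, 1].
    if not 0.0 < gpu_memory_utilization <= 1.0:
        raise ValueError("gpu_memory_utilization must be in the range (0, 1]")
    return [
        "vllm",
        "serve",
        model,
        "--gpu-memory-utilization",
        str(gpu_memory_utilization),
    ]


# The model identifier below is an assumption, not taken from this commit.
command = build_serve_command("Qwen/Qwen3-Next-80B-A3B-Instruct")
print(shlex.join(command))
```

Keeping the value as an explicit argument makes it easy to adjust if the serving process needs more or less memory headroom on a given device.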

- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

---------

Signed-off-by: ming1212 <2717180080@qq.com>
Signed-off-by: ming1212 <104972349+ming1212@users.noreply.github.com>
Signed-off-by: Mengqing Cao <cmq0113@163.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>
2025-12-18 15:16:33 +08:00
ming1212
98b9e2e18e Add Qwen3-Next tutorials (#4607)
### What this PR does / why we need it?

This PR adds a tutorial that introduces the Qwen3-Next model, lists the
features it supports in the current version, walks through the
deployment process, and describes how to run performance and accuracy
tests.

The document is intended to make deploying and testing the Qwen3-Next
model easier.

- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

---------

Signed-off-by: ming1212 <2717180080@qq.com>
Signed-off-by: ming1212 <104972349+ming1212@users.noreply.github.com>
Signed-off-by: Mengqing Cao <cmq0113@163.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>
2025-12-15 11:48:22 +08:00
wangxiyuan
e538fa6f9c [Doc] Update tutorial index (#4920)
Update tutorial index and remove useless doc

- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-12-11 20:53:13 +08:00