[Doc][KV Pool] Revise KV Pool User Guide (#7434)

### What this PR does / why we need it?
Revise the KV Pool user guide:
1. Revise the Mooncake environment variables and the `kv_connector_extra_config` options.
2. Delete `use_ascend_direct` from `kv_connector_extra_config`, as it is deprecated.
3. Delete `kv_buffer_device` and `kv_rank` from the P2P Mooncake config.
4. Unify the default `max-model-len` and `max-num-batched-tokens` values in the examples.
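
For readers updating existing launch scripts, the deletion in item 2 can be sketched as a small cleanup step. This is a minimal Python sketch and not part of this PR: `strip_deprecated` and `DEPRECATED_KEYS` are hypothetical names, and the sample dictionary only mirrors the fields visible in the hunks of this commit.

```python
import json

# Assumption: the only key handled here is the one this commit removes.
DEPRECATED_KEYS = {"use_ascend_direct"}


def strip_deprecated(node):
    """Return a copy of a parsed JSON value with deprecated keys removed at any depth."""
    if isinstance(node, dict):
        return {
            key: strip_deprecated(value)
            for key, value in node.items()
            if key not in DEPRECATED_KEYS
        }
    if isinstance(node, list):
        return [strip_deprecated(item) for item in node]
    return node


# Sample fragment shaped like the kv transfer config shown in the hunks.
old_config = {
    "kv_port": "30000",
    "engine_id": "0",
    "kv_connector_extra_config": {
        "use_ascend_direct": True,
        "prefill": {"dp_size": 1, "tp_size": 16},
    },
}

new_config = strip_deprecated(old_config)
print(json.dumps(new_config, indent=2))
```

The helper returns a new structure instead of mutating its input, so the original configuration stays available for comparison.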

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.17.0
- vLLM main: 4497431df6

---------

Signed-off-by: Pz1116 <zpbzpb123123@gmail.com>
Co-authored-by: Chao Lei <leichao139636@163.com>
pz1116
2026-03-19 10:13:13 +08:00
committed by GitHub
parent ab9cd2e305
commit 3effc4bc70
8 changed files with 58 additions and 86 deletions


@@ -124,7 +124,6 @@ vllm serve /path_to_weight/DeepSeek-V3.1_w8a8mix_mtp \
"kv_port": "30000",
"engine_id": "0",
"kv_connector_extra_config": {
-"use_ascend_direct": true,
"prefill": {
"dp_size": 1,
"tp_size": 16
@@ -192,7 +191,6 @@ vllm serve /path_to_weight/DeepSeek-V3.1_w8a8mix_mtp \
"kv_port": "30000",
"engine_id": "1",
"kv_connector_extra_config": {
-"use_ascend_direct": true,
"prefill": {
"dp_size": 1,
"tp_size": 16


@@ -185,7 +185,6 @@ The template for the mooncake.json file is as follows:
"metadata_server": "P2PHANDSHAKE",
"protocol": "ascend",
"device_name": "",
-"use_ascend_direct": true,
"master_server_address": "<your_server_ip>:50088",
"global_segment_size": 107374182400
}
@@ -195,7 +194,6 @@ The template for the mooncake.json file is as follows:
| --------------| ------------------------| -----------------------------------|
| metadata_server | P2PHANDSHAKE | Point-to-point handshake mode |
| protocol | ascend | Ascend proprietary protocol |
-| use_ascend_direct | true | Enable direct hardware access |
| master_server_address | 90.90.100.188:50088(for example) | Master server address|
| global_segment_size | 107374182400 | Size per segment (100 GB) |
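
As a rough illustration of the template after this change, the following Python sketch renders a `mooncake.json` body without the `use_ascend_direct` key. It assumes only the fields visible in the hunk above (the full template may contain others); `render_mooncake_config` and `GIB` are hypothetical names, and the placeholder address is kept exactly as in the template.

```python
import json

GIB = 1024 ** 3  # bytes per gibibyte

# Fields copied from the hunk above, minus the removed "use_ascend_direct" key.
mooncake_config = {
    "metadata_server": "P2PHANDSHAKE",
    "protocol": "ascend",
    "device_name": "",
    "master_server_address": "<your_server_ip>:50088",
    # 100 * 1024**3 bytes, the same number as the 107374182400 in the table.
    "global_segment_size": 100 * GIB,
}


def render_mooncake_config(config):
    """Serialize the configuration as indented JSON text."""
    return json.dumps(config, indent=4)


print(render_mooncake_config(mooncake_config))
```

Writing the segment size as `100 * GIB` makes the unit explicit; replace the address placeholder with the actual master server IP before use.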


@@ -564,7 +564,6 @@ Before you start, please
"kv_port": "30000",
"engine_id": "0",
"kv_connector_extra_config": {
-"use_ascend_direct": true,
"prefill": {
"dp_size": 2,
"tp_size": 16
@@ -639,7 +638,6 @@ Before you start, please
"kv_port": "30000",
"engine_id": "0",
"kv_connector_extra_config": {
-"use_ascend_direct": true,
"prefill": {
"dp_size": 2,
"tp_size": 16
@@ -716,7 +714,6 @@ Before you start, please
"kv_port": "30100",
"engine_id": "1",
"kv_connector_extra_config": {
-"use_ascend_direct": true,
"prefill": {
"dp_size": 2,
"tp_size": 16
@@ -793,7 +790,6 @@ Before you start, please
"kv_port": "30100",
"engine_id": "1",
"kv_connector_extra_config": {
-"use_ascend_direct": true,
"prefill": {
"dp_size": 2,
"tp_size": 16


@@ -448,7 +448,6 @@ vllm serve vllm-ascend/Qwen3-235B-A22B-w8a8 \
"kv_port": "30000",
"engine_id": "0",
"kv_connector_extra_config": {
-"use_ascend_direct": true,
"prefill": {
"dp_size": 2,
"tp_size": 8
@@ -513,7 +512,6 @@ vllm serve vllm-ascend/Qwen3-235B-A22B-w8a8 \
"kv_port": "30100",
"engine_id": "1",
"kv_connector_extra_config": {
-"use_ascend_direct": true,
"prefill": {
"dp_size": 2,
"tp_size": 8
@@ -579,7 +577,6 @@ vllm serve vllm-ascend/Qwen3-235B-A22B-w8a8 \
"kv_port": "30100",
"engine_id": "1",
"kv_connector_extra_config": {
-"use_ascend_direct": true,
"prefill": {
"dp_size": 2,
"tp_size": 8