[Doc][KV Pool]Revision KV Pool User Guide (#7434)
### What this PR does / why we need it?
Revise the KV Pool user guide:
1. Revise Mooncake environment variables and kvconnector extra configs.
2. Delete `use_ascend_direct` in kv connector extra config as it is
deprecated
3. Delete `kv_buffer_device` and `kv_rank` in P2P mooncake config
4. Unifies default `max-model-len` and `max-num-batch-tokens` in
examples given.
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.17.0
- vLLM main:
4497431df6
---------
Signed-off-by: Pz1116 <zpbzpb123123@gmail.com>
Co-authored-by: Chao Lei <leichao139636@163.com>
This commit is contained in:
@@ -60,7 +60,6 @@ deployment:
|
||||
"kv_port": "30000",
|
||||
"engine_id": "0",
|
||||
"kv_connector_extra_config": {
|
||||
"use_ascend_direct": true,
|
||||
"prefill": {
|
||||
"dp_size": 2,
|
||||
"tp_size": 16
|
||||
@@ -106,7 +105,6 @@ deployment:
|
||||
"kv_port": "30000",
|
||||
"engine_id": "0",
|
||||
"kv_connector_extra_config": {
|
||||
"use_ascend_direct": true,
|
||||
"prefill": {
|
||||
"dp_size": 2,
|
||||
"tp_size": 16
|
||||
@@ -152,7 +150,6 @@ deployment:
|
||||
"kv_port": "30200",
|
||||
"engine_id": "1",
|
||||
"kv_connector_extra_config": {
|
||||
"use_ascend_direct": true,
|
||||
"prefill": {
|
||||
"dp_size": 2,
|
||||
"tp_size": 16
|
||||
@@ -200,7 +197,6 @@ deployment:
|
||||
"kv_port": "30200",
|
||||
"engine_id": "1",
|
||||
"kv_connector_extra_config": {
|
||||
"use_ascend_direct": true,
|
||||
"prefill": {
|
||||
"dp_size": 2,
|
||||
"tp_size": 16
|
||||
|
||||
@@ -45,7 +45,6 @@ deployment:
|
||||
"kv_port": "30000",
|
||||
"engine_id": "0",
|
||||
"kv_connector_extra_config": {
|
||||
"use_ascend_direct": true,
|
||||
"prefill": {
|
||||
"dp_size": 2,
|
||||
"tp_size": 8
|
||||
@@ -84,7 +83,6 @@ deployment:
|
||||
"kv_port": "30100",
|
||||
"engine_id": "1",
|
||||
"kv_connector_extra_config": {
|
||||
"use_ascend_direct": true,
|
||||
"prefill": {
|
||||
"dp_size": 2,
|
||||
"tp_size": 8
|
||||
|
||||
Reference in New Issue
Block a user