[Doc][P/D] Fix MooncakeConnector's name (#5172)
### What this PR does / why we need it?
vLLM community has integrated their MooncakeConnector. The original
scripts will now find this MooncakeConnector instead of the one from
vLLM-Ascend. All scripts that involve using the MooncakeConnector need
to be modified to another name.
### Does this PR introduce _any_ user-facing change?
Yes, users need to use a new name to load vLLM-Ascend MooncakeConnector.
### How was this patch tested?
By CI.
- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c
---------
Signed-off-by: nwpu-zxr <zhouxuerong2@huawei.com>
This commit is contained in:
@@ -504,7 +504,7 @@ vllm serve /path_to_weight/DeepSeek-r1_w8a8_mtp \
|
||||
--speculative-config '{"num_speculative_tokens": 1, "method":"deepseek_mtp"}' \
|
||||
--additional-config '{"recompute_scheduler_enable":true,"enable_shared_expert_dp": true}' \
|
||||
--kv-transfer-config \
|
||||
'{"kv_connector": "MooncakeConnector",
|
||||
'{"kv_connector": "MooncakeConnectorV1",
|
||||
"kv_role": "kv_producer",
|
||||
"kv_port": "30000",
|
||||
"engine_id": "0",
|
||||
@@ -564,7 +564,7 @@ vllm serve /path_to_weight/DeepSeek-r1_w8a8_mtp \
|
||||
--speculative-config '{"num_speculative_tokens": 1, "method":"deepseek_mtp"}' \
|
||||
--additional-config '{"recompute_scheduler_enable":true,"enable_shared_expert_dp": true}' \
|
||||
--kv-transfer-config \
|
||||
'{"kv_connector": "MooncakeConnector",
|
||||
'{"kv_connector": "MooncakeConnectorV1",
|
||||
"kv_role": "kv_producer",
|
||||
"kv_port": "30100",
|
||||
"engine_id": "1",
|
||||
@@ -625,7 +625,7 @@ vllm serve /path_to_weight/DeepSeek-r1_w8a8_mtp \
|
||||
--additional-config '{"recompute_scheduler_enable":true,"multistream_overlap_shared_expert": true,"lm_head_tensor_parallel_size":16}' \
|
||||
--compilation-config '{"cudagraph_mode": "FULL_DECODE_ONLY"}' \
|
||||
--kv-transfer-config \
|
||||
'{"kv_connector": "MooncakeConnector",
|
||||
'{"kv_connector": "MooncakeConnectorV1",
|
||||
"kv_role": "kv_consumer",
|
||||
"kv_port": "30200",
|
||||
"engine_id": "2",
|
||||
@@ -685,7 +685,7 @@ vllm serve /path_to_weight/DeepSeek-r1_w8a8_mtp \
|
||||
--additional-config '{"recompute_scheduler_enable":true,"multistream_overlap_shared_expert": true,"lm_head_tensor_parallel_size":16}' \
|
||||
--compilation-config '{"cudagraph_mode": "FULL_DECODE_ONLY"}' \
|
||||
--kv-transfer-config \
|
||||
'{"kv_connector": "MooncakeConnector",
|
||||
'{"kv_connector": "MooncakeConnectorV1",
|
||||
"kv_role": "kv_consumer",
|
||||
"kv_port": "30200",
|
||||
"engine_id": "2",
|
||||
|
||||
Reference in New Issue
Block a user