[Refactor]Refactor of vllm_ascend/distributed module (#5719)
### What this PR does / why we need it?
Based on the RFC: https://github.com/vllm-project/vllm-ascend/issues/5604
This PR refactors vllm_ascend/distributed, moving all
kv_transfer-related code into a dedicated folder, as has already
been done in vLLM.
### Does this PR introduce _any_ user-facing change?
NA
### How was this patch tested?
- vLLM version: v0.13.0
- vLLM main: 2f4e6548ef
---------
Signed-off-by: lty <linhebiwen@gmail.com>
@@ -163,8 +163,7 @@ vllm serve vllm-ascend/DeepSeek-R1-W8A8 \
 "kv_role": "kv_producer",
 "kv_parallel_size": "1",
 "kv_port": "20001",
-"engine_id": "0",
-"kv_connector_module_path": "vllm_ascend.distributed.mooncake_connector"
+"engine_id": "0"
 }'
 --additional-config '{"enable_weight_nz_layout":true,"enable_prefill_optimizations":true}'
 ```
@@ -230,8 +229,7 @@ vllm serve vllm-ascend/DeepSeek-R1-W8A8 \
 "kv_role": "kv_consumer",
 "kv_parallel_size": "1",
 "kv_port": "20001",
-"engine_id": "0",
-"kv_connector_module_path": "vllm_ascend.distributed.mooncake_connector"
+"engine_id": "0"
 }' \
 --additional-config '{"enable_weight_nz_layout":true}'
 ```
@@ -435,8 +433,7 @@ In the PD separation scenario, we provide a optimized configuration.
 "kv_role": "kv_producer",
 "kv_parallel_size": "1",
 "kv_port": "20001",
-"engine_id": "0",
-"kv_connector_module_path": "vllm_ascend.distributed.mooncake_connector"
+"engine_id": "0"
 }'
 ```

@@ -458,8 +455,7 @@ In the PD separation scenario, we provide a optimized configuration.
 "kv_role": "kv_consumer",
 "kv_parallel_size": "1",
 "kv_port": "20001",
-"engine_id": "0",
-"kv_connector_module_path": "vllm_ascend.distributed.mooncake_connector"
+"engine_id": "0"
 }'
 ```