[DP][V1] Fix rank set in DP scenario & Bump torch-npu version to 2.5.1.post1.dev20250528 (#1235)
### What this PR does / why we need it?

1. Fix rank set in DP scenario. The new PoC version of torch-npu supports setting `ASCEND_RT_VISIBLE_DEVICES` dynamically, so we can use the rank set in `DPEngineCoreProc` directly instead of calculating the local rank across DP by hand in the patched `_init_data_parallel`.

   Closes: https://github.com/vllm-project/vllm-ascend/issues/1170

2. Bump torch-npu version to 2.5.1.post1.dev20250528.

   Closes: https://github.com/vllm-project/vllm-ascend/pull/1242
   Closes: https://github.com/vllm-project/vllm-ascend/issues/1232

### How was this patch tested?

CI passed with the newly added test.

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Icey <1790571317@qq.com>
Co-authored-by: Icey <1790571317@qq.com>
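For context, the relationship between a DP rank and the devices it may see (item 1 above) can be sketched roughly as follows. This is a minimal illustration, not the vLLM or vllm-ascend implementation: the function names, the `world_size` parameter (devices per DP replica), and the contiguous device layout are all assumptions made for the example.

```python
import os


def visible_devices_for_dp_rank(local_dp_rank: int, world_size: int) -> str:
    """Return the comma-separated device IDs one DP replica would own.

    Assumes (hypothetically) that each replica uses `world_size`
    contiguous devices, so replica k owns [k * world_size, (k + 1) * world_size).
    """
    start = local_dp_rank * world_size
    return ",".join(str(i) for i in range(start, start + world_size))


def local_rank_by_hand(local_dp_rank: int, world_size: int, rank_in_replica: int) -> int:
    """The manual offset needed when device visibility cannot be restricted:
    every worker sees all devices and must index into them itself."""
    return local_dp_rank * world_size + rank_in_replica


def restrict_visibility(local_dp_rank: int, world_size: int) -> None:
    """With dynamic visibility, set the env var once per replica; workers can
    then use their rank within the replica (0..world_size-1) directly."""
    os.environ["ASCEND_RT_VISIBLE_DEVICES"] = visible_devices_for_dp_rank(
        local_dp_rank, world_size
    )
```

With two devices per replica, replica 1 would own devices 2 and 3. Under the manual scheme its second worker computes index 3 itself, whereas with restricted visibility that worker simply uses rank 1 within the visible set.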
@@ -47,16 +47,7 @@
 # Related PR (if no, explain why):
 # Future Plan:
 # Remove those patch when vllm merged them
-# 2. `vllm.v1.engine.core.DPEngineCoreProc._init_data_parallel`
-# Why:
-# There is some bug for ASCEND_RT_VISIBLE_DEVICES usage.
-# How:
-# The ASCEND_RT_VISIBLE_DEVICES related code is dropped.
-# Related PR (if no, explain why):
-# No, this is a bug for vllm ascend
-# Future Plan:
-# Remove this patch once ASCEND_RT_VISIBLE_DEVICES bug is fixed.
-# 3. `vllm.config.ParallelConfig.get_next_dp_init_port`
+# 2. `vllm.config.ParallelConfig.get_next_dp_init_port`
 # Why:
 # vllm doesn't support get port from environment.
 # How:
@@ -65,7 +56,7 @@
 # Need a PR to vllm to support get port from environment.
 # Future Plan:
 # Remove those patch when vllm merged them
-# 4. `vllm.config.ParallelConfig.ParallelConfig.stateless_init_dp_group`
+# 3. `vllm.config.ParallelConfig.ParallelConfig.stateless_init_dp_group`
 # Why:
 # vLLM use gloo backend by default to initialize stateless dp process gourp, but we want to use hccl here to
 # get better performance
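The `get_next_dp_init_port` note in the diff says the patch exists because vLLM does not read the port from the environment. A rough sketch of that idea is shown below. The `PortAllocator` class and the `EXAMPLE_DP_MASTER_PORT` variable name are hypothetical and chosen only for illustration; the diff shown here does not include the actual patch body, so this should not be read as the vllm-ascend implementation.

```python
import os
from typing import Mapping


class PortAllocator:
    """Hypothetical stand-in for a config object that hands out DP init ports."""

    def __init__(self, base_port: int) -> None:
        self.next_port = base_port

    def get_next_dp_init_port(self, env: Mapping[str, str] = os.environ) -> int:
        # Default behaviour: hand out consecutive ports from the base port.
        answer = self.next_port
        self.next_port += 1
        # Assumed override: if the (hypothetical) variable is set, prefer it.
        override = env.get("EXAMPLE_DP_MASTER_PORT")
        return int(override) if override is not None else answer
```

Passing `env` explicitly keeps the sketch easy to exercise without mutating the real process environment.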