[main][bugfix]Fix vLLM startup failure when inferring DeepSeek R1 model in DP scenario (#2020)
### What this PR does / why we need it?

Fix a vLLM startup failure when running inference for the DeepSeek R1 model in the DP scenario.

When running vLLM inference for the DeepSeek R1 model in the DP32+TP1 configuration, the vLLM service fails to start with the following error.

<img width="1786" height="918" alt="21b2011042d4f77f36f5243fa64d9c18" src="https://github.com/user-attachments/assets/df1963fe-587e-43ca-822e-a9094d0034fb" />

The root cause is a missing `else` branch after [this line of code](d629f0b2b5/vllm_ascend/ops/fused_moe.py (L1411)). This PR fixes the issue.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

CI passed with newly added and existing tests.

- vLLM version: v0.10.0
- vLLM main: 5bbaf492a6

---------

Signed-off-by: zhanghaiwen <zhanghaiwen@cmss.chinamobile.com>
Co-authored-by: zhanghaiwen <zhanghaiwen@cmss.chinamobile.com>
```diff
@@ -1469,6 +1469,8 @@ class AscendFusedMoE(FusedMoE):
                     e_hidden_states, dim=0)
                 final_hidden_states = final_hidden_states[:num_tokens]
                 dispose_tensor(e_hidden_states)
+            else:
+                final_hidden_states = e_hidden_states
         else:
             final_hidden_states = e_hidden_states
```
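The bug pattern fixed here is a conditional path that never assigns the variable returned afterwards, so the missing branch raises at runtime. A minimal sketch of the before/after behavior, using hypothetical names (`combine_expert_output`, `padded`) rather than the actual `AscendFusedMoE` internals:

```python
import torch


def combine_expert_output(e_hidden_states: torch.Tensor,
                          num_tokens: int,
                          padded: bool) -> torch.Tensor:
    """Sketch of the control flow patched in fused_moe.py (names are illustrative)."""
    if padded:
        # Padded path: keep only the first num_tokens rows.
        final_hidden_states = e_hidden_states[:num_tokens]
    else:
        # Before the fix, a branch like this was missing on one code path,
        # so final_hidden_states was never assigned and the service failed
        # to start (UnboundLocalError) in configurations such as DP32+TP1.
        final_hidden_states = e_hidden_states
    return final_hidden_states


x = torch.randn(6, 8)
print(combine_expert_output(x, 4, padded=True).shape)   # torch.Size([4, 8])
print(combine_expert_output(x, 6, padded=False).shape)  # torch.Size([6, 8])
```

The fix simply mirrors the existing fallthrough branch so every path assigns `final_hidden_states` before it is used.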