xc-llm-ascend

Files

CaveNightingale 2bb7e55022 [Bugfix][PD]fix non-working disaggregated prefill (#2374)

### What this PR does / why we need it?

Mainline vLLM fixed its disaggregated prefill in
https://github.com/vllm-project/vllm/pull/22598, but the feature still does
not work in vllm-ascend.
Concretely, on Ascend devices, decoder instances crash without vLLM's fix
and hang with it.
This patch makes disaggregated prefill work on vllm-ascend.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Tested with Qwen3-0.6B in a 1P1D setup (one prefill instance, one decode instance), tp=1, dp=1.


- vLLM version: v0.10.0
- vLLM main: 0fe85087a9

---------

Signed-off-by: CaveNightingale <cavenightingale@foxmail.com>

2025-08-15 16:59:52 +08:00

__init__.py

[Misc][V0 Deprecation] Remove Cache Engine Used for V0 Worker (#1878)

2025-07-19 09:42:32 +08:00

eagle_proposer_v1.py

[Misc] Fix logger bug (#2024)

2025-07-28 15:59:09 +08:00

model_runner_v1.py

[Bugfix][PD]fix non-working disaggregated prefill (#2374)

2025-08-15 16:59:52 +08:00

mtp_proposer_v1.py

[V1] MTP supports torchair (#2145)