xc-llm-ascend

Files

linfeng-yuan 0fbe0831ec [bugfix][refactor] fix recompute_scheduler break with vllm 0.12.0 & support async scheduling & refactor recompute_scheduler.py (#4895 )

### What this PR does / why we need it?
Currently, the initialization and fundamental functions of
RecomputeScheduler are broken with `vLLM v0.12.0`. This PR fixes the
conflicts of `RecomputeScheduler` and refactor its implementations by
inheriting original `Scheduler` of vLLM. Meanwhile, this PR also
supports async cheduling with recompute scheduler by implementing
`AsyncRecomputeScheduler` which is simply inherited `AsncyScheduler` of
vLLM and `RecomputeScheduler` of vLLM-Ascend with python MRO.
### Does this PR introduce _any_ user-facing change?
No. The switch naming is the same as v0.11.0 :
`recompute_scheduler_enable`
### How was this patch tested?
E2E serving with 2P1D dsv3.1 passed. The performance was the same as
original vllm scheduler with `async_scheduling` and preempted requests
in D Nodes are successfully transfered to Proxy and further to P Node.
This significantly improves the performance and robustness of PD
disaggregation deployments.


- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: linfeng-yuan <1102311262@qq.com>

2025-12-11 22:24:49 +08:00

__init__.py

[Scheduler] Add AscendScheduler. (#543 )

2025-04-17 19:31:50 +08:00

recompute_scheduler.py

[bugfix][refactor] fix recompute_scheduler break with vllm 0.12.0 & support async scheduling & refactor recompute_scheduler.py (#4895 )

2025-12-11 22:24:49 +08:00

scheduler_dynamic_batch.py

upgrade vLLM to main (#4608 )

2025-12-02 22:10:52 +08:00