xc-llm-ascend

Files

hucong 292cf339c3 [BugFix][P/D] Modify the recalculation logic to prevent waiting requests from filling up the D node KVCache (#3641 )

### What this PR does / why we need it?
Modify the recalculation logic to prevent waiting requests from filling
up the D node KVCache

- vLLM version: v0.11.0rc3
- vLLM main:
17c540a993

Signed-off-by: underfituu <hzhucong@163.com>

2025-10-25 09:14:20 +08:00

__init__.py

[Scheduler] Add AscendScheduler. (#543 )

2025-04-17 19:31:50 +08:00

recompute_schedule_config.py

[Bugfix] Route requests requiring KVC recomputation from the decode instance to the P instance (#3448 )

2025-10-18 15:56:44 +08:00

recompute_scheduler.py

[BugFix][P/D] Modify the recalculation logic to prevent waiting requests from filling up the D node KVCache (#3641 )

2025-10-25 09:14:20 +08:00

schedule_config.py

[1/N][Refactor] Refactor code to adapt with vllm main (#3612 )

2025-10-24 16:55:08 +08:00

scheduler_dynamic_batch.py

[1/N][Refactor] Refactor code to adapt with vllm main (#3612 )

2025-10-24 16:55:08 +08:00

scheduler.py

[1/N][Refactor] Refactor code to adapt with vllm main (#3612 )

2025-10-24 16:55:08 +08:00