xc-llm-ascend

Files

xuyexiong 79821106e6 [BugFix]Fix mtp torchair bug caused by #2719 (#3566)

### What this PR does / why we need it?
Fix the MTP torchair bug caused by #2719.
Since FIA needs extra space for padding, we need to enforce
`self.max_num_seqs > self.scheduler_config.max_num_seqs` in the KV consumer
+ MTP case.
This means that `self.max_num_seqs` must be **strictly greater than** the
actual maximum number of requests (`self.scheduler_config.max_num_seqs`).
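
The constraint above can be illustrated with a minimal sketch. This is not the code from `model_runner_v1.py`: the helper name `padded_max_num_seqs`, its flags, and the `extra_padding_slots` parameter are hypothetical, and the commit message does not say how much extra space FIA actually needs, so one slot is assumed purely for illustration.

```python
def padded_max_num_seqs(
    scheduler_max_num_seqs: int,
    is_kv_consumer: bool,
    use_mtp: bool,
    extra_padding_slots: int = 1,  # assumed value; the real amount is not given in the commit
) -> int:
    """Return a runner-side sequence limit (hypothetical helper).

    In the KV consumer + MTP case the result is strictly greater than the
    scheduler's limit, leaving room for padding. Otherwise the scheduler's
    limit is returned unchanged.
    """
    if extra_padding_slots < 1:
        raise ValueError("extra_padding_slots must be at least 1")
    if is_kv_consumer and use_mtp:
        return scheduler_max_num_seqs + extra_padding_slots
    return scheduler_max_num_seqs


# Usage: with a scheduler limit of 16, the padded case must exceed 16.
limit = padded_max_num_seqs(16, is_kv_consumer=True, use_mtp=True)
assert limit > 16
```

The point of the sketch is only the strict inequality: whatever amount of padding the real code reserves, the runner-side limit has to end up above the scheduler's limit in this configuration.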

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?


- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: xuyexiong <xuyexiong@huawei.com>

2025-10-21 22:21:44 +08:00

__init__.py

[Misc][V0 Deprecation] Remove Cache Engine Used for V0 Worker (#1878)

2025-07-19 09:42:32 +08:00

block_table.py

[HybridKV] Fix prefill disaggregation kvcache addr alignment & use hybrid kv cache only when running qwen3_next (#3007)

2025-09-18 21:43:22 +08:00

model_runner_v1.py

[BugFix]Fix mtp torchair bug caused by #2719 (#3566)

2025-10-21 22:21:44 +08:00

npu_input_batch.py

Drop 0.10.2 (#3284)

2025-10-09 10:28:38 +08:00

worker_v1.py

[main] support cpu binding (#3546)

2025-10-21 09:17:03 +08:00