xc-llm-ascend

Files

yeyifan 1191a64ae5 [Feat]attention add sliding windows size (#2528)

### What this PR does / why we need it?
Add a sliding window size parameter to attention
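
For readers unfamiliar with the term, the sketch below illustrates what a sliding window size usually means for causal attention: each query position attends only to itself and the most recent preceding positions. It is a plain-Python illustration, not the code from this PR. The function name `sliding_window_causal_mask` and the convention that the window includes the current token are assumptions made for this example.

```python
def sliding_window_causal_mask(seq_len: int, window_size: int) -> list[list[bool]]:
    """Build a boolean mask where mask[i][j] is True if query i may attend to key j.

    Assumed convention: the window covers the current token plus the
    (window_size - 1) tokens immediately before it.
    """
    if window_size < 1:
        raise ValueError("window_size must be >= 1")
    return [
        # j must not be in the future (i - j >= 0) and must lie inside the window.
        [0 <= i - j < window_size for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    # Print the mask for 5 tokens and a window of 3 as rows of 0/1.
    for row in sliding_window_causal_mask(seq_len=5, window_size=3):
        print("".join("1" if allowed else "0" for allowed in row))
```

With a window at least as large as the sequence length, the mask reduces to ordinary causal attention.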
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
Tested with the `Gemma3` model. Set
`additional_config={"ascend_scheduler_config": {"enabled": True}}`; only
AscendScheduler is supported.
Test command: `python3 -m vllm.entrypoints.openai.api_server --model gemma3 --additional-config '{"ascend_scheduler_config":{"enabled":true}}'`
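
The JSON passed to `--additional-config` is easy to mistype by hand, since Python's `True` must become JSON's `true` and the whole value needs shell quoting. A minimal sketch, using only the standard library, that builds the same test command from a Python dict. The variable names here are illustrative and the model name `gemma3` is taken from the command above.

```python
import json
import shlex

# Same setting as in the test description, written as a Python dict.
additional_config = {"ascend_scheduler_config": {"enabled": True}}

argv = [
    "python3", "-m", "vllm.entrypoints.openai.api_server",
    "--model", "gemma3",
    # Compact separators give the JSON with no spaces, as in the test command.
    "--additional-config", json.dumps(additional_config, separators=(",", ":")),
]

# shlex.join adds the shell quoting needed around the JSON argument.
command = shlex.join(argv)
print(command)
```

The `argv` list can also be passed directly to `subprocess.run`, which avoids shell quoting altogether.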


- vLLM version: v0.10.1.1
- vLLM main: 6578e87365

---------

Signed-off-by: nsdie <yeyifan@huawei.com>

2025-08-28 10:37:19 +08:00

e2e

[main] [refactor] refactor fused_moe.py to enable token_dispatchers (#2570)

2025-08-28 10:13:35 +08:00

[Feat]attention add sliding windows size (#2528)

2025-08-28 10:37:19 +08:00

__init__.py

[SpecDecode] Add spec decode support (#500)

2025-04-17 20:16:32 +08:00