xc-llm-ascend

Li Wang 58adf7c8ac [Bugfix] Correctly handle the output shape in multimodal attention (#5443)

### What this PR does / why we need it?
Fixes https://github.com/vllm-project/vllm-ascend/issues/5297. In the
`AscendMMEncoderAttention` forward pass, the output shape should stay
consistent with the input shape.

- vLLM version: release/v0.13.0
- vLLM main: 81786c8774

---------

Signed-off-by: wangli <wangli858794774@gmail.com>

2025-12-27 18:42:46 +08:00

e2e

[Bugfix] Correctly handle the output shape in multimodal attention (#5443)

2025-12-27 18:42:46 +08:00

Revert "MLA prefill preformance optimization (#5275)" (#5410)

2025-12-27 09:48:56 +08:00

__init__.py

[SpecDecode] Add spec decode support (#500)

2025-04-17 20:16:32 +08:00