EngineX/xc-llm-ascend
xc-llm-ascend/tests/ut/attention
at commit 84d4f474c03a4c3387a9fe2cdc849f9a15947f8a
gh924 6880c1b383 [Feature] Support for cross-attention and whisper model (#5592)
### What this PR does / why we need it?
Resolves https://github.com/vllm-project/vllm-ascend/issues/2262:

- Support cross-attention when the model is an encoder-decoder (see the sketch below).
- Support the Whisper model.

- vLLM version: v0.13.0
- vLLM main: 7157596103

Signed-off-by: gh924 <guihao2@huawei.com>
Co-authored-by: Aoxuan Chen <43376869+chenaoxuan@users.noreply.github.com>
2026-01-11 11:38:45 +08:00
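For context on the two feature bullets above: in an encoder-decoder model such as Whisper, the decoder's cross-attention layers take their queries from decoder tokens but their keys and values from the encoder output, and they apply no causal mask, which is why an attention backend needs a path distinct from decoder self-attention. Below is a minimal, generic PyTorch sketch of that data flow; it is an illustration with made-up shapes, not vllm-ascend's implementation.

```python
# Generic cross-attention sketch (illustration only, not vllm-ascend code).
import torch
import torch.nn.functional as F

torch.manual_seed(0)
hidden = 64
enc_len, dec_len = 1500, 32  # e.g. Whisper: long audio encoding, short transcript

encoder_out = torch.randn(1, enc_len, hidden)  # encoder output, fixed per request
decoder_x = torch.randn(1, dec_len, hidden)    # current decoder hidden states

wq, wk, wv = (torch.randn(hidden, hidden) for _ in range(3))

q = decoder_x @ wq    # queries come from the decoder
k = encoder_out @ wk  # keys come from the encoder
v = encoder_out @ wv  # values come from the encoder

# Unlike decoder self-attention, there is no causal mask: every decoder
# position may attend to the entire encoder sequence.
out = F.scaled_dot_product_attention(q, k, v, is_causal=False)
print(out.shape)  # torch.Size([1, 32, 64])
```

One practical consequence, and plausibly what the updated tests exercise, is that the KV entries for these layers are sized by the encoder sequence length and written once per request, rather than growing with each decoding step.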
File | Last commit | Date
test_attention_cp.py | [Refactor] 7/N Extract common code to common_cp (#5490) | 2026-01-05 17:41:12 +08:00
test_attention_mask.py | [Refactor] 2/N Unify all mask generation methods and cache mask (#4779) | 2025-12-09 18:51:00 +08:00
test_attention_v1.py | [Feature] Support for cross-attention and whisper model (#5592) | 2026-01-11 11:38:45 +08:00
test_mla_cp.py | [Refactor] Fix AttentionMaskBuilder singleton and remove redundant pcp_prefill_mask (#4870) | 2026-01-07 17:09:52 +08:00
test_mla_v1.py | [Refactor] Fix AttentionMaskBuilder singleton and remove redundant pcp_prefill_mask (#4870) | 2026-01-07 17:09:52 +08:00
test_sfa_v1.py | [Refactor] Replace the implementations of o_proj, q_b_proj, and kv_b_proj with custom_op for sharded CP (#5698) | 2026-01-09 15:58:40 +08:00
utils.py | [Refactor] (UT, PCP, DCP) Refactor PCP & DCP patches in UTs (#5505) | 2026-01-05 09:05:45 +08:00
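These UTs appear to be plain pytest modules, so a single file can presumably be run on its own; a small driver sketch follows (the chosen path is just an example, and the pytest-based setup is an assumption, not verified against this repo's CI):

```python
# Hedged sketch: run one of the listed UT modules through pytest's Python API.
# The target path is an example; substitute any file from the table above.
import sys

import pytest

if __name__ == "__main__":
    # pytest.main returns an exit code; propagate it to the shell.
    sys.exit(pytest.main(["-v", "tests/ut/attention/test_attention_mask.py"]))
```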