xc-llm-ascend

whx a58013440a [BugFix][MLA] Fix attn_mask bug for ring mla (#2704)

This PR fixes a bug related to the attention mask used in ring MLA. Ring
MLA now supports a compressed mask, so we can directly use a 512 * 512
attention mask.
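
As an illustration of the fixed-size mask mentioned above, here is a minimal sketch in plain Python. It assumes the compressed mask is an upper-triangular (causal) matrix in which 1 marks a masked position; the actual layout, dtype, and device expected by the ring MLA kernel are not stated in this commit message. The names `MASK_SIZE` and `build_causal_mask` are hypothetical and do not come from the repository.

```python
MASK_SIZE = 512  # side length quoted in the commit message


def build_causal_mask(size: int = MASK_SIZE) -> list[list[int]]:
    """Return a size x size mask; 1 marks a future position to be masked.

    Assumption: the mask is strictly upper-triangular, so row i may attend
    to columns 0..i and every column j > i is masked out.
    """
    return [[1 if col > row else 0 for col in range(size)] for row in range(size)]


# The mask size is fixed and does not depend on the sequence length.
mask = build_causal_mask()
```

The point of the sketch is only that the mask is built once at a fixed size instead of being sized to each sequence.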

- vLLM version: v0.10.1.1
- vLLM main: b5ee1e3261

---------

Signed-off-by: whx-sjtu <2952154980@qq.com>

2025-09-04 10:22:46 +08:00

test_attention_mask.py

[main][bugfix] Fix bugs and refactor cached mask generation logic (#2442)

2025-08-27 12:07:29 +08:00

test_attention_v1.py

[Feat]attention add sliding windows size (#2528)

2025-08-28 10:37:19 +08:00

test_mla_v1.py

[BugFix][MLA] Fix attn_mask bug for ring mla (#2704)

2025-09-04 10:22:46 +08:00