xc-llm-ascend

Qiu 13adcbe44b feat(attention_cp): support chunked prefill for Qwen3Next with PCP&DCP (#6900)

### What this PR does / why we need it?
Support chunked prefill for Qwen3Next with PCP&DCP

- vLLM version: v0.16.0
- vLLM main: 15d76f74e2

---------

Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>

2026-03-09 17:55:09 +08:00

test_attention_cp.py

feat(attention_cp): support chunked prefill for Qwen3Next with PCP&DCP (#6900)

2026-03-09 17:55:09 +08:00

test_attention_mask.py

[Refactor] 2/N Unify all mask generation methods and cache mask (#4779)

2025-12-09 18:51:00 +08:00

test_attention_v1.py

[UT]: refactoring 310p ops ut (#6296)

2026-01-27 16:31:51 +08:00

test_mla_cp.py

[Refact]Refact MLA/SFA weight prefetch to consist with moe weight prefetch (#6629)

2026-02-10 14:14:37 +08:00

test_mla_v1.py

[Refact]Refact MLA/SFA weight prefetch to consist with moe weight prefetch (#6629)

2026-02-10 14:14:37 +08:00

test_sfa_v1.py

clean 0.15.0 support (#6852)

2026-02-28 09:20:57 +08:00

utils.py

[refactor](UT,PCP,DCP) refactor pcp&dcp patches in UTs (#5505)

2026-01-05 09:05:45 +08:00