xc-llm-ascend

Files

Qiu 13adcbe44b feat(attention_cp): support chunked prefill for Qwen3Next with PCP&DCP (#6900)

### What this PR does / why we need it?
Support chunked prefill for Qwen3Next with PCP&DCP

- vLLM version: v0.16.0
- vLLM main: 15d76f74e2

---------

Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>

2026-03-09 17:55:09 +08:00

context_parallel

feat(attention_cp): support chunked prefill for Qwen3Next with PCP&DCP (#6900)

2026-03-09 17:55:09 +08:00

__init__.py

[Core] Make V1 work and enable V1 engine test (#389)

2025-03-28 19:34:23 +08:00

attention_mask.py

[Lint]Style: Convert vllm-ascend/ to ruff format(Batch #2) (#5977)

2026-01-19 08:59:46 +08:00

attention_v1.py

[Refactor][EAGLE] 8/N delete mtp_proposer (re-pull) (#7033)

2026-03-06 17:11:22 +08:00

mla_v1.py

[Refact]Refact MLA/SFA weight prefetch to consist with moe weight prefetch (#6629)

2026-02-10 14:14:37 +08:00

sfa_v1.py

[perf][refactor] Refactor and optimize sfa_v1.py for dsv3.2/glm5 (#6874)

2026-03-05 14:27:11 +08:00

utils.py

feat(attention_cp): support chunked prefill for Qwen3Next with PCP&DCP (#6900)

2026-03-09 17:55:09 +08:00