EngineX/xc-llm-ascend
xc-llm-ascend/vllm_ascend/_310p (v0.18.0)
Shaoxu Cheng e0e585a109 [310P]: add torch chunk gated delta rule and 910b parity ut (#7594)
### What this PR does / why we need it?
RFC https://github.com/vllm-project/vllm-ascend/issues/7394
Add a PyTorch implementation of the chunk gated delta rule on 310P (see the reference sketch after the commit details below).
### Does this PR introduce _any_ user-facing change?
NO
### How was this patch tested?
UT

---------

Signed-off-by: Tflowers-0129 <2906339855@qq.com>
2026-03-25 16:46:43 +08:00
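For context on what this commit adds, the gated delta rule maintains a per-head state matrix that is decayed by a gate and updated with a delta-rule write at every token; chunked implementations process the sequence in blocks for efficiency, and parity unit tests usually compare against a naive recurrence. The snippet below is a minimal recurrent sketch of the rule, not the repository's code: the function name, tensor shapes, and gating convention (per-token scalar log-decay `g` and write strength `beta`) are assumptions made for illustration.

```python
import torch


def gated_delta_rule_recurrent_ref(q, k, v, beta, g):
    """Naive recurrent reference for the gated delta rule (illustrative only).

    q, k : [B, H, T, Dk]   queries / keys
    v    : [B, H, T, Dv]   values
    beta : [B, H, T]       per-token write strength in (0, 1)
    g    : [B, H, T]       per-token log decay (<= 0), applied as exp(g)
    Returns o : [B, H, T, Dv]
    """
    B, H, T, Dk = q.shape
    Dv = v.shape[-1]
    S = q.new_zeros(B, H, Dk, Dv)  # recurrent state per head
    outs = []
    for t in range(T):
        q_t, k_t, v_t = q[:, :, t], k[:, :, t], v[:, :, t]
        b_t = beta[:, :, t].unsqueeze(-1)                    # [B, H, 1]
        a_t = g[:, :, t].exp().unsqueeze(-1).unsqueeze(-1)   # [B, H, 1, 1]
        S = S * a_t                                          # gated decay of the old state
        # Delta-rule write: subtract the value currently stored along k_t, add the new one.
        v_old = torch.einsum('bhk,bhkv->bhv', k_t, S)
        S = S + torch.einsum('bhk,bhv->bhkv', k_t, b_t * (v_t - v_old))
        outs.append(torch.einsum('bhk,bhkv->bhv', q_t, S))   # read out with the query
    return torch.stack(outs, dim=2)


# Example usage with small random tensors (shapes are arbitrary for illustration).
B, H, T, Dk, Dv = 1, 2, 8, 16, 16
q = torch.randn(B, H, T, Dk)
k = torch.randn(B, H, T, Dk)
v = torch.randn(B, H, T, Dv)
beta = torch.sigmoid(torch.randn(B, H, T))  # write strength in (0, 1)
g = -torch.rand(B, H, T)                    # log decay <= 0
o = gated_delta_rule_recurrent_ref(q, k, v, beta, g)
print(o.shape)  # torch.Size([1, 2, 8, 16])
```

A chunked implementation such as the one referenced in this commit would produce the same outputs as this recurrence up to numerical tolerance, which is what a 910B parity unit test would check.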
| Name | Last commit | Date |
| --- | --- | --- |
| attention | [Refactor] [310p] Support Mamba Cache and support attn_head_size larger than 128 (#7372) | 2026-03-19 09:16:22 +08:00 |
| fused_moe | [refactor] replace scattered business kwargs with typed request objects and explicit stage boundaries (#7024) | 2026-03-20 23:23:57 +08:00 |
| ops | [310P]: add torch chunk gated delta rule and 910b parity ut (#7594) | 2026-03-25 16:46:43 +08:00 |
| quantization | [refactor] replace scattered business kwargs with typed request objects and explicit stage boundaries (#7024) | 2026-03-20 23:23:57 +08:00 |
| __init__.py | [Feature]: Support 310P device run qwen2.5/3 dense and qwen2.5vl models (#5776) | 2026-01-17 11:49:18 +08:00 |
| model_runner_310p.py | upgrade to 0.18.0 (#7502) | 2026-03-21 16:05:38 +08:00 |
| sharded_state_loader_310p.py | [Feat][310p] 310P support w8a8s quantization and saving w8a8sc state (#6878) | 2026-03-02 20:09:15 +08:00 |
| worker_310p.py | [Refactor] [310p] Support Mamba Cache and support attn_head_size larger than 128 (#7372) | 2026-03-19 09:16:22 +08:00 |