xc-llm-ascend

Files

wangqiankun13 118b0ed346 [Feature] Add token mask for DispatchGmmCombineDecode operator (#5171)

### What this PR does / why we need it?
This PR adds an optional input, x_active_mask, to the DispatchGmmCombineDecode operator.
When the mask is provided, only tokens whose mask value is True are dispatched and processed.
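
The masking behaviour described above can be illustrated with a minimal pure-Python sketch. It is not the operator's implementation: the function `dispatch_with_mask`, the `experts` callables, and the choice to return `None` for inactive tokens are all hypothetical, since the PR description does not say what the operator outputs for masked-out tokens.

```python
from typing import Callable, List, Optional, Sequence


def dispatch_with_mask(
    tokens: Sequence[float],
    expert_ids: Sequence[int],
    experts: Sequence[Callable[[float], float]],
    x_active_mask: Optional[Sequence[bool]] = None,
) -> List[Optional[float]]:
    """Hypothetical sketch: send each active token to its expert.

    Tokens whose mask entry is False are skipped entirely. Returning
    None for them is an assumption made for this illustration only.
    """
    # The mask is optional: without it, every token is treated as active.
    if x_active_mask is None:
        x_active_mask = [True] * len(tokens)
    if not (len(tokens) == len(expert_ids) == len(x_active_mask)):
        raise ValueError("tokens, expert_ids and x_active_mask must have equal length")

    outputs: List[Optional[float]] = [None] * len(tokens)
    for i, (tok, eid, active) in enumerate(zip(tokens, expert_ids, x_active_mask)):
        if not active:
            continue  # masked-out token: neither dispatched nor processed
        outputs[i] = experts[eid](tok)  # dispatch, compute, write back in place
    return outputs


# Two toy "experts" standing in for the grouped matmul stage.
toy_experts = [lambda x: x * 2, lambda x: x + 100]
masked = dispatch_with_mask([1, 2, 3, 4], [0, 1, 0, 1], toy_experts,
                            x_active_mask=[True, False, True, False])
unmasked = dispatch_with_mask([1, 2, 3, 4], [0, 1, 0, 1], toy_experts)
```

With the mask, only the first and third tokens reach an expert; without it, the call behaves as if every token were active, which matches the input being optional.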


- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

Signed-off-by: wangqiankun <wangqiankun13@huawei.com>

2025-12-19 16:31:48 +08:00

e2e

[Feature] Add token mask for DispatchGmmCombineDecode operator (#5171)

2025-12-19 16:31:48 +08:00

[refactor] refactor weight trans nz and transpose (#4878)

2025-12-19 14:27:24 +08:00

__init__.py

[SpecDecode] Add spec decode support (#500)

2025-04-17 20:16:32 +08:00