xc-llm-ascend

Files

Ruri e31b31f9c3 [main][Bugfix] Fix unable to load qwen3_moe quantized weights (#2219)

### What this PR does / why we need it?

Fixes a failure to load `qwen3_moe` quantized weights, an issue caused by #1994.

### Does this PR introduce _any_ user-facing change?

None

### How was this patch tested?

Added a `qwen3_moe` W8A8 quantized model to
`tests/e2e/multicard/test_qwen3_moe.py`.

- vLLM version: v0.10.0
- vLLM main: c494f96fbc

---------

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>

2025-08-06 09:08:36 +08:00

e2e

[main][Bugfix] Fix unable to load qwen3_moe quantized weights (#2219)

2025-08-06 09:08:36 +08:00

[Test] add rejection sampler ut (#2084)

2025-08-05 19:03:36 +08:00

__init__.py

[SpecDecode] Add spec decode support (#500)

2025-04-17 20:16:32 +08:00