xc-llm-ascend

Files

Eric-dot 3c66a970f2 add mxfp8 moe quantization (#6670)

### What this PR does / why we need it?
Add support for MXFP8 quantization (Qwen MoE).
Use an adaptor to make the hardware-specific behavior clearer and more
maintainable.
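
A minimal sketch of the adaptor idea described above, not the code in this PR. Every name here (`BaseDeviceAdaptor`, `FallbackAdaptor`, `MxCapableAdaptor`, `get_adaptor`, and the device-type strings) is hypothetical and chosen only for illustration. The point is that callers ask one object for device-specific behavior instead of branching on hardware type at every call site.

```python
from abc import ABC, abstractmethod


class BaseDeviceAdaptor(ABC):
    """Hypothetical interface: one place for device-specific decisions."""

    @abstractmethod
    def moe_quant_scheme(self) -> str:
        """Return the label of the MoE quantization scheme to use."""


class FallbackAdaptor(BaseDeviceAdaptor):
    # Assumed behavior for devices without MXFP8 support.
    def moe_quant_scheme(self) -> str:
        return "fallback"


class MxCapableAdaptor(BaseDeviceAdaptor):
    # Assumed behavior for devices that can run MXFP8 kernels.
    def moe_quant_scheme(self) -> str:
        return "mxfp8"


# Hypothetical registry keyed by an illustrative device-type string.
_ADAPTORS = {
    "mx_capable": MxCapableAdaptor,
}


def get_adaptor(device_type: str) -> BaseDeviceAdaptor:
    """Pick the adaptor for a device type, defaulting to the fallback."""
    return _ADAPTORS.get(device_type, FallbackAdaptor)()


if __name__ == "__main__":
    for device in ("mx_capable", "other"):
        print(device, "->", get_adaptor(device).moe_quant_scheme())
```

With this shape, supporting a new device means adding one adaptor class and one registry entry, and the calling code stays unchanged.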
### How was this patch tested?

- vLLM version: v0.15.0
- vLLM main: 13397841ab

---------

Signed-off-by: fangrongcan <17343701736@163.com>
Signed-off-by: wangyao-i <iwangyao@outlook.com>
Signed-off-by: linfeng-yuan <1102311262@qq.com>
Signed-off-by: Eric-dot <60131170+Eric-dot@users.noreply.github.com>
Co-authored-by: fangrongcan <f00876277@china.huawei.com>
Co-authored-by: wangyao-i <iwangyao@outlook.com>
Co-authored-by: linfeng-yuan <1102311262@qq.com>

2026-03-02 11:04:06 +08:00

__init__.py

[Refactor] Provide a framework to accommodate operators for different hardware devices (#5735)

2026-01-13 09:53:26 +08:00

device_op.py

add mxfp8 moe quantization (#6670)

2026-03-02 11:04:06 +08:00