xc-llm-ascend/tests
Slightwind 4f6d60eb06 [Feature] Add W4A4 Flat Quantization support (#3427)
Introduce W4A4 (4-bit weight, 4-bit activation) Flat Quantization to
improve model compression and inference efficiency on Ascend devices.

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: SlightwindSec <slightwindsec@gmail.com>
2025-10-13 23:20:16 +08:00