xc-llm-ascend

Files

maxmgrdv ef9d8367f5 [Feature] Add support of new W4A4_LAOS_DYNAMIC quantization method (#5143)

Introduce W4A4 LAOS quantization for better model compression and
inference efficiency on Ascend devices.

- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

Signed-off-by: hfadzxy <starmoon_zhang@163.com>

2026-01-22 10:34:58 +08:00

e2e

[Feature] Add support of new W4A4_LAOS_DYNAMIC quantization method (#5143)

2026-01-22 10:34:58 +08:00

[Refactor] AttentionBuilder inherit from base class in vllm (#5916)

2026-01-21 10:45:45 +08:00

__init__.py

[SpecDecode] Add spec decode support (#500)

2025-04-17 20:16:32 +08:00