xc-llm-ascend

Files

shaopeng-666 176bfc36bc [BugFix] fix 3vl dense model load quant weight (#6100)

### What this PR does / why we need it?
Fix an error when loading quantized weights for the Qwen3VL dense model.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
The Qwen3VL quantized model service initialized successfully. Inference
requests were processed correctly, and valid responses were returned.

- vLLM version: v0.13.0
- vLLM main: d68209402d

Signed-off-by: 李少鹏 <lishaopeng21@huawei.com>

2026-01-22 20:05:25 +08:00

compressed_tensors

[Quantization] Support compressed tensors moe w8a8 int8 dynamic weight (#5718)

2026-01-14 09:17:26 +08:00

__init__.py

[Core] Cherry pick from 0.7.1 to keep the main code newest (#127)

2025-02-21 17:07:37 +08:00

quant_config.py

[BugFix] fix 3vl dense model load quant weight (#6100)

2026-01-22 20:05:25 +08:00

utils.py

[Feature] Add support of new W4A4_LAOS_DYNAMIC quantization method (#5143)