xc-llm-ascend

Files

shaopeng-666 6e8d3681ae [bugdix] The problem that the w4a8 weight fails to be loaded when the EP is not enabled is resolved. (#7090 )

### What this PR does / why we need it?
This is a bug fix to resolve the issue where the MOE model fails to load
quantized weights in w4a8 format when EP is not enabled.The parameters
["weight_scale_second", "weight_offset_second", "scale_bias"] shall be
parsed in per-group mode, regardless of other conditions.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?

- vLLM version: v0.16.0
- vLLM main:
4034c3d32e

Signed-off-by: 李少鹏 <lishaopeng21@huawei.com>

2026-03-10 16:57:05 +08:00

methods

[EPLB][bugfix] Bugfix for fused mc2 (#6794 )

2026-03-09 11:26:57 +08:00

__init__.py

[Lint]Style: Convert vllm-ascend/ to ruff format(Batch #7 ) (#6023 )

2026-02-06 14:56:53 +08:00

compressed_tensors_config.py

[Lint]Style: Convert vllm-ascend/ to ruff format(Batch #7 ) (#6023 )

2026-02-06 14:56:53 +08:00

method_adapters.py

[bugdix] The problem that the w4a8 weight fails to be loaded when the EP is not enabled is resolved. (#7090 )