xc-llm-ascend

Files

tanhaoan333 15f6564976 [Model]Add Qwen3-Omni quantization Ascend NPU adaptation and optimization (#6828)

### What this PR does / why we need it?
This pull request adapts Qwen3-Omni quantization for Ascend NPUs. Using
patch-based modifications, it adds operator-level optimizations and
optimizes the AUT (Auto-Quantization Tuning) component.
### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.15.0
- vLLM main: 83b47f67b1

---------

Signed-off-by: tanhaoan333 <tanhaoan@huawei.com>

2026-03-03 00:07:23 +08:00

methods

[Model]Add Qwen3-Omni quantization Ascend NPU adaptation and optimization (#6828)

2026-03-03 00:07:23 +08:00

__init__.py

[Lint]Style: Convert vllm-ascend/ to ruff format(Batch #7) (#6023)

2026-02-06 14:56:53 +08:00

compressed_tensors_config.py

[Lint]Style: Convert vllm-ascend/ to ruff format(Batch #7) (#6023)

2026-02-06 14:56:53 +08:00

method_adapters.py

[Main2Main] Upgrade vLLM to 0226 (#6813)