xc-llm-ascend

Files

pu-zhe 23524f2ca4 [Refactor]refactor 310p ops and add ut (#6591)

### What this PR does / why we need it?
This pull request refactors the operations in the vllm-ascend project
that are optimized for Ascend 310P hardware. It streamlines the
implementation of core components such as quantization and multi-head
attention, making the codebase more maintainable and robust. It also
adds unit tests to verify the correctness and reliability of the
refactored modules.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
End-to-end (E2E) test with qwen3-32b w8a8.

- vLLM version: v0.15.0
- vLLM main: d7e17aaacd

---------

Signed-off-by: pu-zhe <zpuaa@outlook.com>

2026-02-07 09:25:17 +08:00

__init__.py

[Feat.]: support 310p w8a8 (#6454)

2026-02-03 14:13:06 +08:00

registry.py

[Feat.]: support 310p w8a8 (#6454)

2026-02-03 14:13:06 +08:00

w8a8_static.py

[Refactor]refactor 310p ops and add ut (#6591)

2026-02-07 09:25:17 +08:00