[Fusion] normalize fusion naming and enable e2e test (#4693)

### What this PR does / why we need it?
This PR standardizes the fusion naming, changing
`enable_quantization_fusion` to `fuse_norm_quant`, and enables e2e
testing.
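
As a minimal sketch of the rename (not the actual vllm-ascend class), the stand-in below copies only the constructor signature shown in this PR's diff. The variable names `default_cfg`, `disabled_cfg`, and `legacy_cfg` are hypothetical and exist only for illustration.

```python
class AscendCompilationConfig:
    """Stand-in that mirrors the constructor from this PR's diff (illustrative only)."""

    def __init__(self, fuse_norm_quant: bool = True, **kwargs):
        # New name introduced by this PR; defaults to True as in the diff.
        self.fuse_norm_quant = fuse_norm_quant
        # Unrecognized keys land in **kwargs and are ignored in this sketch.


# Default: norm + quant fusion is enabled.
default_cfg = AscendCompilationConfig()

# Explicitly disable the fusion using the new name.
disabled_cfg = AscendCompilationConfig(fuse_norm_quant=False)

# In this stand-in, the old name is absorbed by **kwargs and has no effect,
# so the fusion stays at its default value.
legacy_cfg = AscendCompilationConfig(enable_quantization_fusion=False)

print(default_cfg.fuse_norm_quant, disabled_cfg.fuse_norm_quant, legacy_cfg.fuse_norm_quant)
```

With a signature of this shape, a caller that still passes `enable_quantization_fusion` would not get an error; the value would simply be ignored, so call sites should be updated to `fuse_norm_quant`.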

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
CI passed with the newly added and existing tests.

- vLLM version: v0.12.0
- vLLM main: ad32e3e19c

---------

Signed-off-by: wxsIcey <1790571317@qq.com>
Icey
2025-12-11 17:53:43 +08:00
committed by GitHub
parent 07c7131104
commit 18221c0e1d
8 changed files with 136 additions and 113 deletions

@@ -190,19 +190,18 @@ class AscendCompilationConfig:
     deployed on Ascend platforms.
     """
-    def __init__(self, enable_quantization_fusion: bool = True, **kwargs):
+    def __init__(self, fuse_norm_quant: bool = True, **kwargs):
         """
         Initialize the configuration.
         Args:
-            enable_quantization_fusion (bool): Whether to enable quantization fusion optimization.
-                When set to True, the system will optimize quantization-related operations,
-                reducing the number of quantization/dequantization nodes.
+            fuse_norm_quant (bool): Whether to enable norm and quant fusion optimization.
+                When set to True, the system will optimize norm and quant operations.
                 Default: True
             **kwargs: Additional optional parameters for forward compatibility and configuration extension.
         """
-        self.enable_quantization_fusion = enable_quantization_fusion
+        self.fuse_norm_quant = fuse_norm_quant
         # Add more compilation related configs here as needed