[Fusion] change fusion env variable (#6201)

### What this PR does / why we need it?
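Now that CI has Triton integrated, `fuse_qknorm_rope` is enabled by
default: its default changes from `False` to `True`, and the value is no
longer overridden by `HAS_TRITON`.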

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
CI passed with newly added and existing tests.


- vLLM version: v0.14.0
- vLLM main: d68209402d

---------

Signed-off-by: wxsIcey <1790571317@qq.com>
Icey
2026-01-24 22:49:33 +08:00
committed by GitHub
parent 6ccccad102
commit 7799c4ca3b
2 changed files with 5 additions and 5 deletions


@@ -76,7 +76,8 @@ The details of each configuration option are as follows:
| Name | Type | Default | Description |
| ---- | ---- | ------- | ----------- |
| `fuse_norm_quant` | bool | `True` | Whether to enable fuse_norm_quant pass. |
-| `fuse_qknorm_rope` | bool | `False` | Whether to enable fuse_qknorm_rope pass. It's set to True by default when Triton is installed. |
+| `fuse_qknorm_rope` | bool | `True` | Whether to enable fuse_qknorm_rope pass. If Triton is not in the environment, set it to False. |
| `fuse_allreduce_rms` | bool | `False` | Whether to enable fuse_allreduce_rms pass. It's set to False because of conflict with SP. |
**eplb_config**
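
The updated table row asks users without Triton to set `fuse_qknorm_rope` to `False` themselves. A minimal sketch of that check follows. It assumes the option is passed under an `ascend_compilation_config` key of `additional_config`; that key name and both helper functions are illustrative assumptions, not part of this diff.

```python
import importlib.util


def triton_available() -> bool:
    """Return True if a `triton` package can be located in this environment."""
    try:
        return importlib.util.find_spec("triton") is not None
    except (ImportError, ValueError):
        # find_spec can raise for broken or partially initialised modules.
        return False


def build_additional_config() -> dict:
    """Hypothetical helper: disable the pass only when Triton is missing."""
    config: dict = {}
    if not triton_available():
        # Assumed key name; check the project's configuration docs.
        config["ascend_compilation_config"] = {"fuse_qknorm_rope": False}
    return config


if __name__ == "__main__":
    print(build_additional_config())
```

When Triton is present the helper returns an empty dict, leaving the new default of `True` in effect.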


@@ -17,7 +17,6 @@ import os
from typing import TYPE_CHECKING
from vllm.logger import logger
-from vllm.triton_utils import HAS_TRITON
from vllm.utils.math_utils import cdiv
if TYPE_CHECKING:
@@ -190,7 +189,7 @@ class AscendCompilationConfig:
"""
def __init__(
-self, fuse_norm_quant: bool = True, fuse_qknorm_rope: bool = False, fuse_allreduce_rms: bool = False, **kwargs
+self, fuse_norm_quant: bool = True, fuse_qknorm_rope: bool = True, fuse_allreduce_rms: bool = False, **kwargs
):
"""
Initialize the configuration.
@@ -200,13 +199,13 @@ class AscendCompilationConfig:
When set to True, the system will optimize norm and quant operations.
Default: True
fuse_qknorm_rope (bool): Whether to enable qknorm and rope fusion optimization.
-Default: False
+Default: True
fuse_allreduce_rms (bool): Whether to enable allreduce and addrmsnorm fusion optimization.
Default: False
**kwargs: Additional optional parameters for forward compatibility and configuration extension.
"""
self.fuse_norm_quant = fuse_norm_quant
-self.fuse_qknorm_rope = HAS_TRITON or fuse_qknorm_rope
+self.fuse_qknorm_rope = fuse_qknorm_rope
self.fuse_allreduce_rms = fuse_allreduce_rms
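
The behavioural effect of the constructor change can be sketched in isolation. The two functions below are hypothetical stand-ins that mirror only the `fuse_qknorm_rope` assignment before and after this commit; they are not the real `AscendCompilationConfig`, and `has_triton` is passed in as a plain argument instead of being imported from vLLM.

```python
def resolve_before(has_triton: bool, fuse_qknorm_rope: bool = False) -> bool:
    """Old behaviour: an installed Triton forces the pass on."""
    return has_triton or fuse_qknorm_rope


def resolve_after(fuse_qknorm_rope: bool = True) -> bool:
    """New behaviour: the argument is used as given, defaulting to True."""
    return fuse_qknorm_rope


if __name__ == "__main__":
    # Compare both versions for every combination of inputs.
    for has_triton in (False, True):
        for requested in (False, True):
            print(
                f"has_triton={has_triton} requested={requested} "
                f"before={resolve_before(has_triton, requested)} "
                f"after={resolve_after(requested)}"
            )
```

Under the old expression an explicit `False` was ignored whenever Triton was installed. With the new assignment an explicit `False` is honoured, and an environment without Triton gets `True` unless the user overrides it.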