[misc][FlashComm1][ACLGraph] Incompatibility between Flashcomm1 and FULL_DECODE_ONLY. (#5200)

### What this PR does / why we need it?
Currently, Flashcomm1 and FULL_DECODE_ONLY are incompatible. When both
features are enabled, graph capture errors occur without clear error
messages.

After discussion, it has been determined that enabling FULL_DECODE_ONLY
with Flashcomm1 in mixed deployment scenarios provides almost no TPOT
benefit. Additionally, a reconstruction of the decode phase for
flashcomm1 is currently underway. Therefore, related adaptation work is
temporarily postponed and will be addressed after the decode phase
reconstruction plan is finalized.

For now, an assert will be added to provide clear error messages and
correct deployment recommendations.

### Does this PR introduce _any_ user-facing change?
NO

### How was this patch tested?
NO

- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c

---------

Signed-off-by: QiuChunshuo <qiuchunshuo@huawei.com>
This commit is contained in:
Qiu
2025-12-22 03:33:32 -03:00
committed by GitHub
parent b84ad8c5d8
commit 64669c4243

View File

@@ -5,7 +5,7 @@ import torch
import torch_npu
from torch import nn
from vllm.attention.backends.abstract import AttentionBackend, MLAAttentionImpl
from vllm.config import VllmConfig, get_current_vllm_config
from vllm.config import CUDAGraphMode, VllmConfig, get_current_vllm_config
from vllm.distributed import get_tensor_model_parallel_world_size, get_tp_group
from vllm.forward_context import get_forward_context
from vllm.logger import logger
@@ -148,6 +148,12 @@ class AscendSFAMetadataBuilder:
self.enable_sfa_cp = enable_sp() and \
hasattr(self.model_config.hf_config, "index_topk")
assert not (
self.enable_sfa_cp
and self.vllm_config.compilation_config.cudagraph_mode
== CUDAGraphMode.FULL_DECODE_ONLY
), "FlashComm1 is not compatible with FULL_DECODE_ONLY. Please set graph_mode to 'piecewise' or disable FlashComm1."
def reorder_batch(self, input_batch: "NPUInputBatch",
scheduler_output: "SchedulerOutput") -> bool:
# No need to reorder for Ascend SFA