[bugfix] Support dsv3.2 enable both mtp and full_decode_only (#5679)

### What this PR does / why we need it?
PR #5230 introduced a problem: when both mtp and full_decode_only are
enabled for the DSV32 model, the operators cannot be compiled into the
graph. This PR fixes that issue.
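
The description does not say why the combination fails, so the following is only a sketch of one plausible reading. It assumes that with MTP every decode request carries `1 + k` query tokens per step, so a builder that advertises support only for single-token decode batches would never match such a batch. `ToyCGSupport`, `decode_query_lens`, and `can_capture_decode_graph` are hypothetical names for illustration, not vLLM or vllm-ascend APIs.

```python
from enum import Enum


class ToyCGSupport(Enum):
    """Toy stand-in for the two support levels named in the diff.

    This is NOT the real AttentionCGSupport class; the semantics below
    are an assumption made for illustration only.
    """
    UNIFORM_SINGLE_TOKEN_DECODE = 1
    UNIFORM_BATCH = 2


def decode_query_lens(num_requests: int, num_speculative_tokens: int) -> list[int]:
    """Per-request query lengths for one decode step.

    Assumption: each request processes 1 token plus k speculative (MTP) tokens.
    """
    return [1 + num_speculative_tokens] * num_requests


def can_capture_decode_graph(support: ToyCGSupport, query_lens: list[int]) -> bool:
    """Return True if the batch fits the declared support level."""
    if not query_lens:
        return False
    # A batch is uniform when every request has the same query length.
    uniform = len(set(query_lens)) == 1
    if support is ToyCGSupport.UNIFORM_SINGLE_TOKEN_DECODE:
        # Only batches where every request has exactly one query token qualify.
        return uniform and query_lens[0] == 1
    # UNIFORM_BATCH: any common query length is acceptable.
    return uniform


# Without MTP (k = 0) both levels accept the batch; with MTP (k = 1) only
# the broader level does.
no_mtp = decode_query_lens(num_requests=4, num_speculative_tokens=0)
with_mtp = decode_query_lens(num_requests=4, num_speculative_tokens=1)
```

Under these assumptions, `can_capture_decode_graph(ToyCGSupport.UNIFORM_SINGLE_TOKEN_DECODE, with_mtp)` is false while the same call with `ToyCGSupport.UNIFORM_BATCH` is true, which is consistent with the one-line change in the diff.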

- vLLM version: v0.13.0
- vLLM main: 2f4e6548ef

Signed-off-by: cookieyyds <126683903+cookieyyds@users.noreply.github.com>
cookieyyds
2026-01-08 15:47:31 +08:00
committed by GitHub
parent ccbc5e2ba1
commit 8b3a7a9e87


@@ -167,7 +167,7 @@ class AscendSFAMetadataBuilder(MLACommonMetadataBuilder[AscendSFAMetadata]):
     ) -> AttentionCGSupport:
         # Explicit override in case the underlying builder specialized this getter.
         # @override omitted only because of mypy limitation due to type variable.
-        return AttentionCGSupport.UNIFORM_SINGLE_TOKEN_DECODE
+        return AttentionCGSupport.UNIFORM_BATCH

     def reorder_batch(self, input_batch: "NPUInputBatch",
                       scheduler_output: "SchedulerOutput") -> bool: