[Refactor] Rename cudagraph_support to aclgraph_support (#3104)
### What this PR does / why we need it?
Updates the `cudagraph_support` attribute to `aclgraph_support` to use
terminology appropriate for the Ascend platform (ACL graphs instead of
CUDA graphs).
This change also explicitly disables graph support for the MLA attention
backend.
### Does this PR introduce _any_ user-facing change?
None.
### How was this patch tested?
None needed.
- vLLM version: v0.10.2
- vLLM main:
5aeb925452
Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>
This commit is contained in:
@@ -199,8 +199,8 @@ class AscendMetadata:
|
||||
|
||||
|
||||
class AscendAttentionMetadataBuilder:
|
||||
# Does this backend/builder support CUDA Graphs for attention (default: no).
|
||||
cudagraph_support: ClassVar[AttentionCGSupport] = \
|
||||
# Does this backend/builder support ACL Graphs for attention (default: no).
|
||||
aclgraph_support: ClassVar[AttentionCGSupport] = \
|
||||
AttentionCGSupport.UNIFORM_SINGLE_TOKEN_DECODE
|
||||
# Does this backend/builder reorder the batch?
|
||||
# If not, set this to None. Otherwise set it to the query
|
||||
|
||||
Reference in New Issue
Block a user