[Feat][main] Supported to use full-graph with Qwen3-Next-MTP (#5477)
### What this PR does / why we need it?
Supported to use full-graph with Qwen3-Next-MTP.
In detail, we adatpted `AscendAttentionState.ChunkedPrefill` in main
model, and also adapted `AscendAttentionState.ChunkedPrefill` in mtp
model.
### Does this PR introduce _any_ user-facing change?
N/A
### How was this patch tested?
We changed the test of Qwen3-Next-MTP in
`tests/e2e/multicard/test_qwen3_next.py` to make it a test of
`FULL_DECODE_ONLY`. Then run `pytest -s
tests/e2e/multicard/test_qwen3_next.py::test_qwen3_next_distributed_mp_eager_mtp_similarity_tp4`.
And this test passed.
```text
.
================================================================================================================================= warnings summary =================================================================================================================================
<frozen importlib._bootstrap>:241
<frozen importlib._bootstrap>:241: DeprecationWarning: builtin type SwigPyPacked has no __module__ attribute
<frozen importlib._bootstrap>:241
<frozen importlib._bootstrap>:241: DeprecationWarning: builtin type SwigPyObject has no __module__ attribute
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
==================================================================================================================== 1 passed, 2 warnings in 271.89s (0:04:31) =====================================================================================================================
```
- vLLM version: v0.13.0
- vLLM main:
5326c89803
Signed-off-by: drslark <slarksblood@qq.com>
This commit is contained in:
@@ -293,7 +293,7 @@ class AscendAttentionMetadataBuilder(AttentionMetadataBuilder[AscendMetadata]):
|
||||
)
|
||||
else:
|
||||
raise NotImplementedError(
|
||||
"Currently we only support building dummy metadata for DecodeOnly state"
|
||||
"Currently we only support building dummy metadata for DecodeOnly and ChunkedPrefill state"
|
||||
)
|
||||
|
||||
attn_metadata.attn_state = attn_state
|
||||
|
||||
Reference in New Issue
Block a user