[Bugfix] Add missing draft_attn_metadatas parameter to fix MTP test (#6232)
### What this PR does / why we need it?

Fix the MTP test failure caused by accessing the non-existent attribute `forward_context.draft_attn_metadatas`.

**Root cause:** In `AscendAttentionBackendImpl.update_graph_params`, the code accessed `forward_context.draft_attn_metadatas`, but the `ForwardContext` class has no such attribute. The original code passed this value via a function parameter.

**Fix:** Add a `draft_attn_metadatas` parameter through the entire call chain:

- the `update_full_graph_params` function in `acl_graph.py`
- all `update_graph_params` methods in the attention backends
- `eagle_proposer.py`, which now passes the parameter explicitly

Also applied Gemini's suggestion to default `vllm_config=None` in `AscendAttentionCPImpl.update_graph_params` for API consistency.

Related to item 9 in #5463.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

This fixes the CI test failure:
`test_deepseek_mtp_correctness[True-FULL_DECODE_ONLY-2-wemaster/deepseek_mtp_main_random_bf16]`

Signed-off-by: lico67373 <918688502@qq.com>
```diff
@@ -1184,7 +1184,8 @@ class EagleProposer(VllmEagleProposer):
     def _update_full_graph_params(self, forward_context, num_tokens, draft_attn_metadatas=None):
         update_full_graph_params(
             self.runner.attn_backend, self.update_stream, forward_context, num_tokens,
-            self.vllm_config, self.vllm_config.speculative_config)
+            self.vllm_config, self.vllm_config.speculative_config,
+            draft_attn_metadatas=draft_attn_metadatas)

     # padding tensor into desired size
     def _pad_tensor(self, tensor, pad_size):
```
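The bug-and-fix pattern can be sketched in isolation. This is a hypothetical minimal reproduction, not the actual vllm-ascend code: the names mirror those in the PR, but the class and function bodies are invented for illustration.

```python
# Sketch of the fix pattern (hypothetical stand-ins, not the real
# vllm-ascend APIs): thread an optional value through the call chain
# as a keyword parameter instead of reading it off the context object.

class ForwardContext:
    """Simplified stand-in: note it has NO draft_attn_metadatas attribute."""
    def __init__(self, attn_metadata):
        self.attn_metadata = attn_metadata

def update_graph_params(forward_context, draft_attn_metadatas=None):
    # Before the fix, this read forward_context.draft_attn_metadatas,
    # which raised AttributeError; now the value arrives as a parameter.
    if draft_attn_metadatas is not None:
        return draft_attn_metadatas
    return forward_context.attn_metadata

def update_full_graph_params(forward_context, draft_attn_metadatas=None):
    # Middle of the chain: forward the keyword argument unchanged.
    return update_graph_params(
        forward_context, draft_attn_metadatas=draft_attn_metadatas)

ctx = ForwardContext(attn_metadata="decode_meta")
# The caller (the proposer, in the real code) supplies the draft metadata:
print(update_full_graph_params(ctx, draft_attn_metadatas="draft_meta"))
# Without it, the chain falls back to the context's own metadata:
print(update_full_graph_params(ctx))
```

Passing the value explicitly keeps `ForwardContext` unchanged and makes the dependency visible in every signature along the chain, which is what the patch above restores.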
||||