[CI] Fix EAGLE CI problems (#6702)

### What this PR does / why we need it?
The new FIA operator requires queryT to equal the last element of
actualSequenceLengthQ; this PR fixes the EAGLE CI failures caused by that requirement.
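
The constraint above can be illustrated with a minimal sketch. It is not the code changed in this PR. It assumes that actualSequenceLengthQ holds cumulative per-request query lengths for a packed token layout and that queryT is the total number of query tokens passed to the operator. The helper names `cumulative_seq_lens` and `check_query_tokens` are hypothetical.

```python
from itertools import accumulate


def cumulative_seq_lens(query_lens):
    """Turn per-request query lengths into cumulative end offsets.

    Assumption: this mirrors how actualSequenceLengthQ is laid out.
    """
    return list(accumulate(query_lens))


def check_query_tokens(num_query_tokens, actual_seq_lengths_q):
    """Raise ValueError unless the token count matches the last offset.

    num_query_tokens stands in for queryT (hypothetical name).
    """
    if not actual_seq_lengths_q:
        raise ValueError("actual_seq_lengths_q must not be empty")
    expected = actual_seq_lengths_q[-1]
    if num_query_tokens != expected:
        raise ValueError(
            f"query has {num_query_tokens} tokens, but the last element of "
            f"actual_seq_lengths_q is {expected}"
        )


# Three requests contributing 2, 3, and 1 query tokens.
seq_lens_q = cumulative_seq_lens([2, 3, 1])
check_query_tokens(6, seq_lens_q)  # passes: 6 tokens match the last offset
```

Under these assumptions, a query tensor padded beyond the real token count (for example, to a fixed graph-capture size) would fail the check until either the tensor is trimmed or the last offset is extended to cover the padding.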

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Passed the existing test (test_mtp_eagle_correctness.py).

- vLLM version: v0.15.0
- vLLM main: 9562912cea

---------

Signed-off-by: Wangbingjie <wangbj1207@126.com>
Signed-off-by: Wangbingjie <w30061490@china.huawei.com>
Co-authored-by: Wangbingjie <w30061490@china.huawei.com>
Dijurido
2026-02-26 10:26:01 +08:00
committed by GitHub
parent 2870f7c8ad
commit 169e434f78
2 changed files with 15 additions and 2 deletions

@@ -120,7 +120,6 @@ def test_deepseek_mtp_correctness(model_name: str, num_speculative_tokens: int,
del spec_llm
@pytest.mark.skip(reason="Failed with CANN8.5, fix me")
@pytest.mark.parametrize("model_name", MODELS_EAGLE)
@pytest.mark.parametrize("model_name_main", MODELS_MAIN)
@pytest.mark.parametrize("num_speculative_tokens", [1, 2])
@@ -169,7 +168,6 @@ def test_llama_qwen3_eagle_correctness(
"draft_tensor_parallel_size":
draft_tensor_parallel_size,
"max_model_len": 128,
"draft_vocab_size": 128256,
},
compilation_config=CompilationConfig(
cudagraph_mode="FULL_DECODE_ONLY",