upgrade vLLM to main (#4608)

1. fix https://github.com/vllm-project/vllm/pull/28542
   The model structure modifications involved are:
   - Qwen2.5-VL (some patches still remain)
   - Qwen2-VL
   - Qwen2
   - DeepSeek series
   - Qwen-moe series
2. fix https://github.com/vllm-project/vllm/pull/29121
   The output token type has changed from NumPy arrays to `list[list[int]]`.
3. fix https://github.com/vllm-project/vllm/pull/29262
   The `xformers` backend for multimodal has been deprecated.
4. fix https://github.com/vllm-project/vllm/pull/29342
5. fix https://github.com/vllm-project/vllm/pull/28579
6. fix https://github.com/vllm-project/vllm/pull/28718
7. fix https://github.com/vllm-project/vllm/issues/28665
8. fix https://github.com/vllm-project/vllm/pull/26847
   vLLM introduced `optimization-level`; some default configs have changed, and the `--enforce-eager` param has been deprecated.
9. fix http://github.com/vllm-project/vllm/pull/29223
   The sampler now returns a tuple.
10. fix https://github.com/vllm-project/vllm/pull/29471
    We'll remove the related patch to avoid this kind of error.

Co-authored-by: hfadzxy <starmoon_zhang@163.com>
Co-authored-by: wangli <wangli858794774@gmail.com>

- vLLM version: v0.11.2

---------

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: hfadzxy <starmoon_zhang@163.com>
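
As a rough illustration of the output token type change in item 2, the sketch below normalizes sampled token IDs to plain `list[list[int]]`, whether a caller still passes array-like rows or already passes lists. The helper name `to_token_id_lists` is hypothetical and is not part of vLLM or of this commit; it only assumes that old-style rows expose a `.tolist()` method, as NumPy arrays do.

```python
from typing import Any, Sequence


def to_token_id_lists(sampled: Sequence[Any]) -> list[list[int]]:
    """Normalize per-request sampled token IDs to plain list[list[int]].

    Hypothetical helper for illustration only; not a vLLM API.
    """
    normalized: list[list[int]] = []
    for row in sampled:
        # Array-like rows (the old format) expose .tolist(); plain lists do not.
        values = row.tolist() if hasattr(row, "tolist") else row
        normalized.append([int(v) for v in values])
    return normalized


# Test fixtures can now pass nested lists directly, as the diff does.
valid_sampled = to_token_id_lists([[5], [6, 7], [8, 9, 10]])
```

With the new type, test inputs such as `valid_sampled` no longer need the per-row array conversion that the diff removes.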
2025-12-02 22:10:52 +08:00
parent 4588cdac02
commit 7f2673ea2d
60 changed files with 383 additions and 374 deletions
--- a/tests/ut/spec_decode/test_eagle_proposer.py
+++ b/tests/ut/spec_decode/test_eagle_proposer.py
@@ -224,7 +224,6 @@ class TestEagleProposerGenerateTokenIds(TestBase):

    def test_generate_token_ids_without_metadata(self):
        valid_sampled = [[20, 30, 40]]
-        valid_sampled = [np.array(sublist) for sublist in valid_sampled]
        scheduler_output = MagicMock()
        scheduler_output.num_scheduled_tokens = [2, 1, 3]
        positions = torch.tensor([0, 1, 2, 3, 4, 5])
@@ -251,7 +250,6 @@ class TestEagleProposerGenerateTokenIds(TestBase):

    def test_generate_token_ids_with_metadata(self):
        valid_sampled = [[5], [6, 7], [8, 9, 10]]
-        valid_sampled = [np.array(sublist) for sublist in valid_sampled]
        spec_metadata = MagicMock()
        spec_metadata.num_draft_tokens = [2, 3, 4]