upgrade vLLM to main (#4608)
1. fix https://github.com/vllm-project/vllm/pull/28542 The model structure modifications we involved in are: - Qwen2.5-VL(still exist some patch) - Qwen2-VL - Qwen2 - DeepSeek series - Qwen-moe series 2. fix https://github.com/vllm-project/vllm/pull/29121 the output token now type changed from np to `list[list[int]]` 3. fix https://github.com/vllm-project/vllm/pull/29262 `xformers` backend for multimodal now has been deprecated 4. fix https://github.com/vllm-project/vllm/pull/29342 5. fix https://github.com/vllm-project/vllm/pull/28579 6. fix https://github.com/vllm-project/vllm/pull/28718 7. fix https://github.com/vllm-project/vllm/issues/28665 8. fix https://github.com/vllm-project/vllm/pull/26847 vllm introduced the `optimization-level`, some default config has been changed, and the param `--enforce-eager` has been deprecated 9. fix http://github.com/vllm-project/vllm/pull/29223 it retuns tuple for sampler. 10. fix https://github.com/vllm-project/vllm/pull/29471 we'll remove the related patch to avoid this kind of error. Co-authored-by: hfadzxy <starmoon_zhang@163.com> Co-authored-by: wangli <wangli858794774@gmail.com> - vLLM version: v0.11.2 --------- Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com> Signed-off-by: wangli <wangli858794774@gmail.com> Signed-off-by: hfadzxy <starmoon_zhang@163.com> Co-authored-by: wangli <wangli858794774@gmail.com> Co-authored-by: hfadzxy <starmoon_zhang@163.com>
This commit is contained in:
@@ -42,7 +42,7 @@ jobs:
|
||||
lint:
|
||||
uses: ./.github/workflows/pre-commit.yml
|
||||
with:
|
||||
vllm: v0.11.2
|
||||
vllm: 86e178f7c4d8c3b0eaf3c8e3f810a83f63b90e24
|
||||
changes:
|
||||
runs-on: ubuntu-latest
|
||||
outputs:
|
||||
@@ -84,7 +84,7 @@ jobs:
|
||||
SOC_VERSION: ascend910b1
|
||||
strategy:
|
||||
matrix:
|
||||
vllm_version: [v0.11.2]
|
||||
vllm_version: [86e178f7c4d8c3b0eaf3c8e3f810a83f63b90e24]
|
||||
steps:
|
||||
- name: Install packages
|
||||
run: |
|
||||
@@ -142,7 +142,7 @@ jobs:
|
||||
name: e2e-light
|
||||
strategy:
|
||||
matrix:
|
||||
vllm_version: [v0.11.2]
|
||||
vllm_version: [86e178f7c4d8c3b0eaf3c8e3f810a83f63b90e24]
|
||||
# Note (yikun): If CI resource are limited we can split job into two chain jobs
|
||||
needs: [lint, changes]
|
||||
# only trigger e2e test after lint passed and the change is e2e related with pull request.
|
||||
|
||||
Reference in New Issue
Block a user