SILONG ZENG
ab37a7d5ae
[main]Upgrade cann to 8.3rc2 (#4350)
...
### What this PR does / why we need it?
Upgrade cann to 8.3rc2
### Does this PR introduce _any_ user-facing change?
Yes, docker image will use 8.3.RC2
- vLLM version: v0.11.2
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2
---------
Signed-off-by: MrZ20 <2609716663@qq.com>
2025-11-28 14:06:01 +08:00
wangxiyuan
bc69d7cfe1
upgrade to vllm 0.11.2 (#4400)
...
Bump vLLM version to v0.11.2
What's broken and changed by vLLM:
1. structured_output is broken by
https://github.com/vllm-project/vllm/pull/26866
2. get_mrope_input_positions is broken by
https://github.com/vllm-project/vllm/pull/28399
3. graph mode is broken by
https://github.com/vllm-project/vllm/pull/25110. We'll upgrade torch to
2.8 later to fix the problem.
4. embedding is broken by
https://github.com/vllm-project/vllm/pull/27583
5. `get_attn_backend_cls` and the attention backend are broken by
https://github.com/vllm-project/vllm/pull/28534
6. spec decode is broken by
https://github.com/vllm-project/vllm/pull/28771
7. sp feature is broken by
https://github.com/vllm-project/vllm/pull/27126
8. mtp is broken by https://github.com/vllm-project/vllm/pull/27922
9. lora is broken by https://github.com/vllm-project/vllm/pull/21068
10. execute_model is broken by
https://github.com/vllm-project/vllm/pull/26866
11. `VLLM_DISABLE_SHARED_EXPERTS_STREAM` env is broken by
https://github.com/vllm-project/vllm/pull/28159
12. kv cache is broken by https://github.com/vllm-project/vllm/pull/27753
13. dp is broken by https://github.com/vllm-project/vllm/pull/25110
What's broken and changed by ourselves:
1. qwen vl is broken by https://github.com/vllm-project/vllm/pull/28455.
We'll remove model files in the future to avoid this kind of error.
2. Engine core is broken by
https://github.com/vllm-project/vllm/pull/23691. We'll remove the patch
file in the future.
3. Ascend scheduler is broken by
https://github.com/vllm-project/vllm/pull/28733. We'll remove the ascend
scheduler later.
4. qwen3-next is broken by
https://github.com/vllm-project/vllm/pull/28083. We'll remove model files
in the future to avoid this kind of error.
5. qwen vl is broken by https://github.com/vllm-project/vllm/pull/27764.
We'll remove model files in the future.
Known issue:
1. ray doesn't work
2. the accuracy of qwen3-next is not correct
3. qwen3-vl is broken
4. prefix cache + ascend scheduler + deepseek v2 lite is broken.
Co-authored-by: MengqingCao <cmq0113@163.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: 22dimensions <waitingwind@foxmail.com>
Co-authored-by: shen-shanshan <467638484@qq.com>
- vLLM version: v0.11.2
---------
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: MengqingCao <cmq0113@163.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
2025-11-26 11:48:58 +08:00
dependabot[bot]
84eae97f27
Bump actions/checkout from 4 to 6 (#4380)
...
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 6.
- vLLM main:
2918c1b49c
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-25 09:05:11 +08:00
wangxiyuan
a1f142b7ad
Drop 0.11.0 support (#4377)
...
There is a lot of hack code for v0.11.0, which makes the code hard to
upgrade to newer vLLM versions. Since v0.11.0 will release soon, let's
drop v0.11.0 support first. Then we'll upgrade to v0.11.2 soon.
- vLLM version: v0.11.0
- vLLM main:
2918c1b49c
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-11-24 17:08:20 +08:00
zhangxinyuehfad
d40ba52454
[Fix] fix Qwen2-Audio-7B-Instruct accuracy test (#4017)
...
### What this PR does / why we need it?
fix Qwen2-Audio-7B-Instruct accuracy test
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.11.0
- vLLM main:
83f478bb19
Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-11-10 11:54:18 +08:00
zhangxinyuehfad
737cad2b6b
[Test] Refactor accuracy test to nightly test (#3814)
...
### What this PR does / why we need it?
Refactor accuracy test to nightly test
- vLLM version: v0.11.0
- vLLM main:
83f478bb19
Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-11-06 09:06:59 +08:00