dependabot[bot]
3c3c9a5386
Bump actions/checkout from 6.0.0 to 6.0.1 (#4772)
...
Bumps [actions/checkout](https://github.com/actions/checkout) from 6.0.0 to 6.0.1.
- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-12-08 19:15:40 +08:00
Li Wang
752a55473c
[Misc] Upgrade vllm commit to 2025_12_04 (#4690)
...
### What this PR does / why we need it?
As the title shows, upgrade the vllm commit hash to `ad32e3e`
- vLLM version: v0.12.0
---------
Signed-off-by: wangli <wangli858794774@gmail.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-12-04 22:31:45 +08:00
wangxiyuan
3f4c0ea0a0
upgrade vLLM to 0.12.0 tag (#4647)
...
Upgrade vLLM to v0.12.0 tag
- vLLM version: 86e178f7c4d8c3b0eaf3c8e3f810a83f63b90e24
- vLLM main:
86e178f7c4
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-12-03 23:43:05 +08:00
wangxiyuan
7f2673ea2d
upgrade vLLM to main (#4608)
...
1. fix https://github.com/vllm-project/vllm/pull/28542
The model structure modifications involved are:
- Qwen2.5-VL (some patches still exist)
- Qwen2-VL
- Qwen2
- DeepSeek series
- Qwen-moe series
2. fix https://github.com/vllm-project/vllm/pull/29121
the output token type changed from a NumPy array to `list[list[int]]`
(see the sketch after this list)
3. fix https://github.com/vllm-project/vllm/pull/29262
the `xformers` backend for multimodal has been deprecated
4. fix https://github.com/vllm-project/vllm/pull/29342
5. fix https://github.com/vllm-project/vllm/pull/28579
6. fix https://github.com/vllm-project/vllm/pull/28718
7. fix https://github.com/vllm-project/vllm/issues/28665
8. fix https://github.com/vllm-project/vllm/pull/26847
vllm introduced the `optimization-level` option; some default configs have
changed, and the `--enforce-eager` param has been deprecated
9. fix https://github.com/vllm-project/vllm/pull/29223 it now returns a
tuple for the sampler.
10. fix https://github.com/vllm-project/vllm/pull/29471 we'll remove the
related patch to avoid this kind of error.
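A minimal sketch of handling the token-type change in item 2, assuming a hypothetical normalization helper rather than vllm-ascend's actual code:

```python
import numpy as np

# Hypothetical helper: older vLLM handed back output tokens as a NumPy
# array; newer vLLM returns list[list[int]] directly.
def as_token_lists(output_tokens) -> list[list[int]]:
    if isinstance(output_tokens, np.ndarray):
        return output_tokens.tolist()  # old behavior: np array -> nested lists
    return output_tokens  # new behavior: already list[list[int]]
```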
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
Co-authored-by: wangli <wangli858794774@gmail.com>
- vLLM version: v0.11.2
---------
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Co-authored-by: wangli <wangli858794774@gmail.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
2025-12-02 22:10:52 +08:00
dependabot[bot]
e18e3067a7
Bump actions/checkout from 4.3.1 to 6.0.0 (#4592)
...
Bumps [actions/checkout](https://github.com/actions/checkout) from 4.3.1 to 6.0.0.
- vLLM version: v0.11.2
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-02 11:59:25 +08:00
dependabot[bot]
8c65009d62
Bump actions/setup-python from 6.0.0 to 6.1.0 (#4591)
...
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 6.0.0 to 6.1.0.
- vLLM version: v0.11.2
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-01 14:32:08 +08:00
wangxiyuan
bc69d7cfe1
upgrade to vllm 0.11.2 (#4400)
...
Bump vLLM version to v0.11.2
What's broken and changed by vLLM:
1. structured_output is broken by
https://github.com/vllm-project/vllm/pull/26866
2. get_mrope_input_positions is broken by
https://github.com/vllm-project/vllm/pull/28399
3. graph mode is broken by
https://github.com/vllm-project/vllm/pull/25110 we'll upgrade torch to
2.8 to fix the problem later
4. embedding is broken by
https://github.com/vllm-project/vllm/pull/27583
5. `get_attn_backend_cls` and the attention backend are broken by
https://github.com/vllm-project/vllm/pull/28534
6. spec decode is broken by
https://github.com/vllm-project/vllm/pull/28771
7. sp feature is broken by
https://github.com/vllm-project/vllm/pull/27126
8. mtp is broken by https://github.com/vllm-project/vllm/pull/27922
9. lora is broken by https://github.com/vllm-project/vllm/pull/21068
10. execute_model is broken by
https://github.com/vllm-project/vllm/pull/26866
11. `VLLM_DISABLE_SHARED_EXPERTS_STREAM` env is broken by
https://github.com/vllm-project/vllm/pull/28159
12. kv cache is broken by https://github.com/vllm-project/vllm/pull/27753
13. dp is broken by https://github.com/vllm-project/vllm/pull/25110
What's broken and changed by ourselves:
1. qwen vl is broken by https://github.com/vllm-project/vllm/pull/28455
We'll remove model files in the future to avoid this kind of error
2. Engine core is broken by
https://github.com/vllm-project/vllm/pull/23691 We'll remove the patch
file in the future.
3. Ascend scheduler is broken by
https://github.com/vllm-project/vllm/pull/28733 We'll remove the ascend
scheduler later.
4. qwen3-next is broken by
https://github.com/vllm-project/vllm/pull/28083 We'll remove model files
in the future to avoid this kind of error
5. qwen vl is broken by https://github.com/vllm-project/vllm/pull/27764.
We'll remove model files in the future
Known issues:
1. ray doesn't work
2. the accuracy of qwen3-next is not correct
3. qwen3-vl is broken
4. prefix cache + ascend scheduler + deepseek v2 lite is broken.
Co-authored-by: MengqingCao <cmq0113@163.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: 22dimensions <waitingwind@foxmail.com>
Co-authored-by: shen-shanshan <467638484@qq.com>
- vLLM version: v0.11.2
---------
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: MengqingCao <cmq0113@163.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
2025-11-26 11:48:58 +08:00
dependabot[bot]
84eae97f27
Bump actions/checkout from 4 to 6 (#4380)
...
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 6.
- vLLM main:
2918c1b49c
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-25 09:05:11 +08:00
22dimensions
c272747d13
Upgrade to the newest vllm 0.11.1 commit (#3982)
...
### What this PR does / why we need it?
Adapt the vllm-ascend main branch to vllm releases/v0.11.1.
fix `forward context not set` in test_vlm.py, caused by:
https://github.com/vllm-project/vllm/pull/23207
fix the failing `cdiv round` import, caused by:
https://github.com/vllm-project/vllm/pull/27188
fix the failing `init_cached_hf_modules` import, caused by:
https://github.com/vllm-project/vllm/pull/27567
adapt the triton kernel `fused_recurrent_gated_delta_rule_fwd_kernel`, caused
by: https://github.com/vllm-project/vllm/pull/27654
- remove unused code in sigmoid_gating.py:
`class FusedRecurrentFunction`, `fused_recurrent_gated_delta_rule`,
`fused_recurrent_gated_delta_rule_fwd`
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
CI
- vLLM version: v0.11.0
- vLLM main:
83f478bb19
Signed-off-by: 22dimensions <waitingwind@foxmail.com>
2025-11-12 23:01:19 +08:00
Meihan-chen
cba69e117e
[CI] pin vllm commit id (#3861)
...
### What this PR does / why we need it?
the code of vllm has been updated; pin the vllm commit id first to recover CI
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.11.0rc3
- vLLM main:
https://github.com/vllm-project/vllm/commit/releases/v0.11.1
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
2025-10-29 17:43:58 +08:00
Icey
a7450db1bd
Upgrade to the newest vllm 0.11.1 commit (#3762)
...
### What this PR does / why we need it?
c9461e05a4
Fix ```spec decode rejection sampler```, caused by
https://github.com/vllm-project/vllm/pull/26060
Fix some ```import```s, caused by
https://github.com/vllm-project/vllm/pull/27374
Fix ```scheduler_config.send_delta_data```, caused by
https://github.com/vllm-project/vllm-ascend/pull/3719
Fix ```init_with_cudagraph_sizes```, caused by
https://github.com/vllm-project/vllm/pull/26016
Fix the ```vl model``` by replacing PatchEmbed's conv3d with a linear layer
(see the sketch below), caused by https://github.com/vllm-project/vllm/pull/27418
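A minimal sketch of the Conv3d-to-Linear patch-embedding equivalence behind that fix, assuming non-overlapping patches (kernel_size == stride); class name and shapes are illustrative, not the actual vLLM code:

```python
import torch
import torch.nn as nn

# Hedged sketch: for non-overlapping patches, one Linear over the
# flattened patch replaces
# nn.Conv3d(C, embed_dim, kernel_size=patch, stride=patch).
class LinearPatchEmbed(nn.Module):
    def __init__(self, patch_volume: int = 3 * 2 * 14 * 14, embed_dim: int = 1280):
        super().__init__()
        self.proj = nn.Linear(patch_volume, embed_dim)

    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        # patches: (num_patches, patch_volume), already flattened
        return self.proj(patches)

# Usage: 16 flattened 3-channel video patches -> (16, 1280) embeddings
out = LinearPatchEmbed()(torch.randn(16, 3 * 2 * 14 * 14))
```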
### Does this PR introduce _any_ user-facing change?
N/A
### How was this patch tested?
CI passed with new added/existing test.
- vLLM version: v0.11.0rc3
- vLLM main:
c9461e05a4
---------
Signed-off-by: Icey <1790571317@qq.com>
2025-10-28 14:55:03 +08:00
Icey
d9cdc65854
Upgrade to new vllm commit (#3719)
...
### What this PR does / why we need it?
Upgrade to new vllm commit:
c9461e05a4
- Fix many imports, caused by
https://github.com/vllm-project/vllm/pull/26908
- Fix the ```sha256``` import, caused by
https://github.com/vllm-project/vllm/pull/27169
- Remove ```SchedulerConfig.send_delta_data```, caused by
https://github.com/vllm-project/vllm/pull/27142
- Fix ```FusedMoE``` because of dual stream execution, caused by
https://github.com/vllm-project/vllm/pull/26440
### Does this PR introduce _any_ user-facing change?
N/A
### How was this patch tested?
CI passed with new added/existing test.
- vLLM version: v0.11.0rc3
- vLLM main:
17c540a993
---------
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Icey <1790571317@qq.com>
Co-authored-by: MengqingCao <cmq0113@163.com>
2025-10-25 15:36:32 +08:00
Mengqing Cao
cea0755b07
[1/N][Refactor] Refactor code to adapt with vllm main (#3612)
...
### What this PR does / why we need it?
This is step 1 of refactoring the code to adapt to vllm main, and this
PR is aligned with
17c540a993
1. refactor deepseek to the latest code arch as of
17c540a993
2. bunches of fixes due to vllm changes
- Fix `AscendScheduler` `__post_init__`, caused by
https://github.com/vllm-project/vllm/pull/25075
- Fix `AscendScheduler` init got an unexpected arg `block_size`, caused
by https://github.com/vllm-project/vllm/pull/26296
- Fix `KVCacheManager` `get_num_common_prefix_blocks` arg, caused by
https://github.com/vllm-project/vllm/pull/23485
- Fix `MLAAttention` import, caused by
https://github.com/vllm-project/vllm/pull/25103
- Fix `SharedFusedMoE` import, caused by
https://github.com/vllm-project/vllm/pull/26145
- Fix `LazyLoader` import, caused by
https://github.com/vllm-project/vllm/pull/27022
- Fix `vllm.utils.swap_dict_values` import, caused by
https://github.com/vllm-project/vllm/pull/26990
- Fix `Backend` enum import, caused by
https://github.com/vllm-project/vllm/pull/25893
- Fix the `CompilationLevel` to `CompilationMode` rename introduced
by https://github.com/vllm-project/vllm/pull/26355 (see the shim sketch after this list)
- Fix fused_moe ops, caused by
https://github.com/vllm-project/vllm/pull/24097
- Fix bert model because of `inputs_embeds`, caused by
https://github.com/vllm-project/vllm/pull/25922
- Fix MRope because of `get_input_positions_tensor` to
`get_mrope_input_positions`, caused by
https://github.com/vllm-project/vllm/pull/24172
- Fix `splitting_ops` changes introduced by
https://github.com/vllm-project/vllm/pull/25845
- Fix multi-modality changes introduced by
https://github.com/vllm-project/vllm/issues/16229
- Fix lora bias dropping issue introduced by
https://github.com/vllm-project/vllm/pull/25807
- Fix structured output break introduced by
https://github.com/vllm-project/vllm/issues/26737
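Many of the fixes above are plain import-path or rename chases. A minimal compatibility-shim sketch of that pattern, using the `CompilationMode` rename as the example; the exact module paths are assumptions, not the code this PR actually landed:

```python
# Hedged sketch: tolerate both sides of an upstream rename so vllm-ascend
# runs against old and new vLLM. Module paths here are illustrative.
try:
    from vllm.config import CompilationMode  # newer vLLM, after #26355
except ImportError:
    from vllm.config import CompilationLevel as CompilationMode  # older vLLM
```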
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
CI passed with existing test.
- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0
---------
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Icey <1790571317@qq.com>
Co-authored-by: Icey <1790571317@qq.com>
2025-10-24 16:55:08 +08:00
wangxiyuan
a43e2f61e1
[CI] Update vLLM to v0.11.0 (#3315)
...
### What this PR does / why we need it?
There are 3 steps to upgrade vllm-ascend to the newest vllm. We'll create 3
PRs:
- [x] Upgrade vllm to v0.11.0 to make CI happy first.
- [ ] Move deepseek v3.2 to vllm way
- [ ] Then we'll add a new PR to add vllm main support.
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.11.0
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-10-09 10:41:19 +08:00
wangxiyuan
e9359bd8fa
[CI] Pin vLLM to releases/v0.11.0 (#3211)
...
### What this PR does / why we need it?
- Pin vLLM commit to releases/v0.11.0 branch.
- Fix the breaking change introduced by vLLM commit
d4d9899860
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
- vLLM version: v0.10.2
- vLLM main:
17b4c6685c
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-09-27 10:41:48 +08:00
wangxiyuan
2930e4a6bd
[CI] Upgrade vllm to newest commit (#3182)
...
### What this PR does / why we need it?
Upgrade vLLM to newest commit
- Fix the problem where aclgraph doesn't work, caused by
24fab45d96
- Fix PoolerOutput import error, caused by
755ed7b05b
- Fix the aclgraph weight-load error, keeping it consistent with the torchair fix.
4492e3a554
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
All test should pass
- vLLM version: v0.10.2
- vLLM main:
52d0cb8458
---------
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-09-26 06:18:15 +08:00
wangxiyuan
ac1c2cd9ac
[CI] Upgrade vllm version - 0925 (#3167)
...
Upgrade vLLM to newest commit.
1. Remove the useless func `get_state_cls`; it has already been removed
from vLLM.
e6750d0b18
2. Fix the UT broken by
6160ba4151
- vLLM version: v0.10.2
- vLLM main:
b1068903fd
---------
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-09-25 14:20:10 +08:00
wangxiyuan
a055183821
[CI] Upgrade vLLM version (#3139)
...
Upgrade vLLM version to the newest commit.
- Fix the breaking change introduced by
969b4da3a6
- Add a patch to quickly fix torchair
de94289a98
- fix the UT error introduced by
de94289a98
Close: https://github.com/vllm-project/vllm-ascend/issues/3138
- vLLM version: v0.10.2
- vLLM main:
f225ea7dd9
---------
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
Co-authored-by: MengqingCao <cmq0113@163.com>
2025-09-25 07:36:51 +08:00
Li Wang
96eb1ed408
[CI] Bump vLLM commit hash to 0923 (f225ea7) (#3110)
...
### What this PR does / why we need it?
Bump vLLM commit hash to
f225ea7dd9
### How was this patch tested?
- vLLM version: v0.10.2
- vLLM main:
5aeb925452
---------
Signed-off-by: wangli <wangli858794774@gmail.com>
2025-09-23 14:13:25 +08:00
Li Wang
02f89d166f
[CI] Update vllm version to 20250922 (5aeb925) (#3091)
...
### What this PR does / why we need it?
This PR bumps the vllm commit hash to
5aeb925452
fix issues:
1. https://github.com/vllm-project/vllm/pull/25345 has removed v0
metadata
2. https://github.com/vllm-project/vllm/pull/25332
3. https://github.com/vllm-project/vllm/pull/25334
4. https://github.com/vllm-project/vllm/pull/23558, note that this vllm
commit updates the model-registration logic to check that every registered
model resolves under the `vllm.model_executor.models` path, which breaks our
custom registration of the deepseek_v3 model (it doesn't exist in the
vllm model path). So I moved the deepseek_v3 model registration into
deepseek_v2 as a temporary fix (see the sketch after this list).
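A minimal sketch of the kind of out-of-tree registration involved here, assuming vLLM's `ModelRegistry.register_model` with its lazy "module:Class" form; the target class path below is illustrative, not the exact vllm-ascend code:

```python
from vllm import ModelRegistry

# Hedged sketch: point the DeepSeek V3 architecture at a class defined in
# the deepseek_v2 module. The "module:Class" target is an assumption.
ModelRegistry.register_model(
    "DeepseekV3ForCausalLM",
    "vllm_ascend.models.deepseek_v2:CustomDeepseekV3ForCausalLM",
)
```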
### How was this patch tested?
- vLLM version: v0.10.2
- vLLM main:
9607d5eb44
---------
Signed-off-by: wangli <wangli858794774@gmail.com>
2025-09-22 22:18:13 +08:00
Mengqing Cao
f39bd309b6
[Hybrid KV] Follow up UniformTypeKVCacheSpecs (#3070)
...
### What this PR does / why we need it?
Follow up on the `UniformTypeKVCacheSpecs` changes introduced by
https://github.com/vllm-project/vllm/pull/25101, which support different
hidden sizes in uniform-type kvcache specs.
This also fixes the CI issue `TypeError: AttentionGroup.__init__()
missing 1 required positional argument: 'kv_cache_spec'`
### Does this PR introduce _any_ user-facing change?
N/A
### How was this patch tested?
Tests passed with existing e2e tests.
- vLLM version: v0.10.2
- vLLM main:
c60e6137f0
---------
Signed-off-by: MengqingCao <cmq0113@163.com>
2025-09-22 15:02:41 +08:00
Yikun Jiang
b8b68b3dfe
[CI] Upgrade vLLM to 20250920 (c60e613) and address config break (#3067)
...
### What this PR does / why we need it?
Bump main to
c60e6137f0
- Updated imports in `vllm.config` to
`vllm.config.model` (aed16879a9)
https://github.com/vllm-project/vllm/pull/25252
- Refactored `vllm_ascend/sample/sampler.py` to use string values for
`logprobs_mode` instead of the `LogprobsMode` enum, simplifying logprobs
mode handling and improving compatibility with recent vLLM changes
(aed16879a9; see the sketch below)
https://github.com/vllm-project/vllm/pull/25252
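A minimal sketch of the enum-to-string switch described above; the string literals are assumed to mirror vLLM's `logprobs_mode` values, not quoted from the actual diff:

```python
# Hedged sketch: branch on string literals instead of the LogprobsMode
# enum. The literal values below are assumptions, not the exact set.
logprobs_mode = "raw_logprobs"  # previously: LogprobsMode.RAW_LOGPROBS

if logprobs_mode in ("raw_logprobs", "processed_logprobs"):
    use_logprobs = True  # illustrative flag, not the real sampler logic
```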
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
CI passed
- vLLM version: v0.10.2
- vLLM main:
6d8246aaff
---------
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
2025-09-21 09:49:17 +08:00
Li Wang
12bcbd02bb
[CI] Upgrade vLLM to 20250919 (6d8246aa) and fix some broken issues (#2907)
...
### What this PR does / why we need it?
1. This PR bumps the vllm commit to
6d8246aaff
2. fix the upstream change https://github.com/vllm-project/vllm/pull/24548
that drops multi-modal kwargs, making vllm main and `v0.10.2` both adaptable
3. fix metadata_builder changes introduced by
https://github.com/vllm-project/vllm/pull/23693
4. fix `structured_outputs_config` changes introduced by
https://github.com/vllm-project/vllm/pull/22772
5. fix `moe_config` changes introduced by
https://github.com/vllm-project/vllm/pull/22537
Co-authored-by: MengqingCao <cmq0113@163.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
- vLLM version: v0.10.2
- vLLM main:
c60e6137f0
---------
Signed-off-by: wangli <wangli858794774@gmail.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
Co-authored-by: MengqingCao <cmq0113@163.com>
2025-09-20 17:37:57 +08:00
dependabot[bot]
3e60aa5483
Bump actions/setup-python from 5.4.0 to 6.0.0 (#2926)
...
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5.4.0 to 6.0.0.
- vLLM version: v0.10.2
- vLLM main:
3f3313981c
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-16 14:15:10 +08:00
Mengqing Cao
b0403f8d8a
[CI] fix ci (#2464)
...
### What this PR does / why we need it?
1. use actions/checkout@v5 instead of v4
2. remove dbo test case because there is issue with it and will be
refactored later
3. make vllm-ascend compatible with vllm v0.10.1.1 and add CI for it
4. fix sampler api changes introduced by
https://github.com/vllm-project/vllm/pull/22387
5. fix qwen3 moe config changes introduced by
https://github.com/vllm-project/vllm/pull/20562
6. fix kvcache block changes introduced by
https://github.com/vllm-project/vllm/pull/23262
### Does this PR introduce _any_ user-facing change?
N/A
### How was this patch tested?
CI passed with existing test.
- vLLM version: v0.10.0
- vLLM main:
0c6e40bbaa
---------
Signed-off-by: MengqingCao <cmq0113@163.com>
2025-08-22 07:30:48 +08:00
dependabot[bot]
8fb50a4248
Bump actions/checkout from 4 to 5 (#2420)
...
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5.
- vLLM version: v0.10.0
- vLLM main:
5f5664b3e4
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-19 08:54:56 +08:00
Yikun Jiang
997f156a51
Use ci_vllm_version when recording vLLM commit (#1689)
...
### What this PR does / why we need it?
Use ci_vllm_version when recording vllm commit
Follow-up on https://github.com/vllm-project/vllm-ascend/pull/1623
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- Tested manually:
$ python3 docs/source/conf.py | jq .ci_vllm_version | tr -d '"'
v0.9.2
- Test on my local repo: https://github.com/Yikun/vllm-ascend/pull/35
- vLLM version: v0.9.1
- vLLM main:
49e8c7ea25
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
2025-07-10 11:07:27 +08:00
Yikun Jiang
493768eb30
Record vLLM commit in PR description (#1623)
...
### What this PR does / why we need it?
This patch enables vllm commit recording and also cleans up the unused
commit msg note in PRs.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- CI passed
- Tested on https://github.com/Yikun/vllm-ascend/pull/33; the vllm commit
refreshed as expected.
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
2025-07-07 10:20:38 +08:00