### What this PR does / why we need it?
For better usability, add multimodal audio to vllm compiling in
dockerfile defaultly.
Image size will increase only 2.xM.
Signed-off-by: Ting FU <futing10@huawei.com>
This PR upgrade CANN from 8.2rc1 to 8.3rc1 and remove the CANN version
check logic.
TODO: we notice that UT runs failed with CANN 8.3 image. So the base
image for UT is still 8.2. We'll fix it later.
- vLLM version: v0.11.0
- vLLM main:
83f478bb19
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
### What this PR does / why we need it?
There are 3 step to upgrade vllm-ascend to newest vllm. We'll create 3
PR
- [x] Upgrade vllm to v0.11.0 to make CI happy first .
- [ ] Move deepseek v3.2 to vllm way
- [ ] Then we'll add a new PR to add vllm main support.
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.11.0
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
### What this PR does / why we need it?
Bump version to v0.11.0rc2 and prepare vLLM Ascend v0.11.0rc0
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
CI passed
- vLLM version: v0.10.2
- vLLM main:
https://github.com/vllm-project/vllm/commit/releases/v0.11.0
---------
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
### What this PR does / why we need it?
Bump vLLM version to v0.10.2
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
CI passed
- vLLM version: v0.10.2rc3
- vLLM main:
15b8fef453
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
### What this PR does / why we need it?
This reverts commit 339fceb89c.
### Does this PR introduce _any_ user-facing change?
Yes, use 8.2rc1 image by default
### How was this patch tested?
CI passed
- vLLM version: v0.10.2rc2
- vLLM main:
cfa3234a5b
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
### What this PR does / why we need it?
Upgrade CANN version to 8.3.rc1.alpha001
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- vLLM version: v0.10.2rc2
- vLLM main:
89e08d6d18
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
### What this PR does / why we need it?
Upgrade vLLM version to 0.10.2rc2
### Does this PR introduce _any_ user-facing change?
Yes, image will use 0.10.2rc2 vLLM
### How was this patch tested?
- vLLM version: main
- vLLM main:
f17c075884
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
### What this PR does / why we need it?
Remove compatibility maintenance for vllm v0.10.1 and v0.10.1.1
### Does this PR introduce _any_ user-facing change?
branch main of vllm-ascend will not be compatible with vllm v0.10.1 and
v0.10.1.1
### How was this patch tested?
CI passed with existing test.
- vLLM version: v0.10.1.1
- vLLM main:
6fb2788163
---------
Signed-off-by: MengqingCao <cmq0113@163.com>
### What this PR does / why we need it?
1. use action/checkout@v5 instead of v4
2. remove dbo test case because there is issue with it and will be
refactored later
3. make vllm-ascend compatible with vllm v0.10.1.1 and add CI for it
4. fix sampler api changes introduced by
https://github.com/vllm-project/vllm/pull/22387
6. fix qwen3 moe config changes intruoduced by
https://github.com/vllm-project/vllm/pull/20562
7. fix kvcache block changes introduced by
https://github.com/vllm-project/vllm/pull/23262
### Does this PR introduce _any_ user-facing change?
N/A
### How was this patch tested?
CI passed with existing test.
- vLLM version: v0.10.0
- vLLM main:
0c6e40bbaa
---------
Signed-off-by: MengqingCao <cmq0113@163.com>
### What this PR does / why we need it?
Fix protobuf version in Dockerfile to resolve `AttributeError: 'str'
object has no attribute 'DESCRIPTOR' when packaging message to dict`
using protobuf. will remove version specification after
https://github.com/ray-project/ray/pull/54910 is merged
### Does this PR introduce _any_ user-facing change?
N/A
### How was this patch tested?
CI passed with existing test.
- vLLM version: v0.10.0
- vLLM main:
0e36abf993
Signed-off-by: MengqingCao <cmq0113@163.com>
### What this PR does / why we need it?
Upgrade CANN to 8.2.rc1
Backport: https://github.com/vllm-project/vllm-ascend/pull/1653
### Does this PR introduce _any_ user-facing change?
Yes, docker image will use 8.2.RC1
### How was this patch tested?
CI passed
- vLLM version: v0.10.0
- vLLM main:
7728dd77bb
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
### What this PR does / why we need it?
- Upgrade to v0.10.0
- Drop v0.9.2 version compatibility
- Add patch for
`vllm_ascend/patch/worker/patch_common/patch_sampler_gather_logprobs.py`
as workaround of
f3a683b7c9
for v0.10.0 and also add e2e test `test_models_prompt_logprobs`
- Pin transformers<4.54.0 as workaround of
https://github.com/vllm-project/vllm-ascend/issues/2034
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- Test locally:
`VLLM_USE_MODELSCOPE=true pytest -sv
tests/e2e/singlecard/test_offline_inference.py::test_models_prompt_logprobs`
- CI passed
- vLLM version: v0.9.2
- vLLM main:
7728dd77bb
---------
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
### What this PR does / why we need it?
This patch upgrade vLLM version to v0.9.2, this patch didn't remove the
v0.9.1 compatible code to easy review.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- vLLM version: v0.9.1
- vLLM main:
14601f5fba
- Accuracy test with 0.9.2:
https://github.com/vllm-project/vllm-ascend/actions/runs/16121612087
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
1. upgrade vllm to 0.9.1. 0.9.0 is not supported for main branch now.
keep doc to 0.9.0 until we release the first 0.9.1 release.
2. disable V0 test for PR
3. move actionlint check to lint job
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
### What this PR does / why we need it?
Bump vllm version to v0.8.5.post1
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
CI passed
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
### What this PR does / why we need it?
Make CANN version bump separately from
https://github.com/vllm-project/vllm-ascend/pull/708
- Upgrade CANN version to 8.1.rc1
- Add prefix to speed up download
`m.daocloud.io/quay.io/ascend/cann:8.1.rc1-910b-ubuntu22.04-py3.10`
- Address tail sapce for Dockerfile.openEuler
- Add note for `/workspace` and `/vllm-workspace` as followup of
https://github.com/vllm-project/vllm-ascend/pull/741
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
CI passed
Co-authored-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: MengqingCao <cmq0113@163.com>
### What this PR does / why we need it?
The torch-npu 2.5.1 are published:
https://pypi.org/project/torch-npu/2.5.1/
It's time to remove all torch-npu dev version from vllm-ascend code base
### Does this PR introduce _any_ user-facing change?
Yes, using torch-npu 2.5.1
### How was this patch tested?
- [ ] CI passed
- [ ] Manually test
- [ ] Grep all `dev2025`
---------
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
### What this PR does / why we need it?
Bump Dockerfile version to v0.8.4
### Does this PR introduce _any_ user-facing change?
docker image are using v0.8.4 version vLLM
### How was this patch tested?
CI passed
Closes: https://github.com/vllm-project/vllm-ascend/pull/571
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
### What this PR does / why we need it?
This PR enable custom ops build by default.
### Does this PR introduce _any_ user-facing change?
Yes, users now install vllm-ascend from source will trigger custom ops
build step.
### How was this patch tested?
By image build and e2e CI
---------
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
### What this PR does / why we need it?
Pin modelscope<1.23.0 on vLLM v0.7.3 to resolve:
https://github.com/vllm-project/vllm/pull/13807
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
CI passed
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
### What this PR does / why we need it?
Install `wget` to fix image build
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
CI passed
---------
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
### What this PR does / why we need it?
Recover vllm-ascend dev image
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
CI passed
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
### What this PR does / why we need it?
Add ray to docker image to make `ray` work
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
CI passed
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
### What this PR does / why we need it?
Add container image build ci:
- Enable branch, tag docker image publish
- branch image: `vllm-ascend:main`, `vllm-ascend:v0.7.1-dev`
- tag image: `vllm-ascend:v0.7.1rc1`
- Enable PR docker image build check
- other changes:
- Prepare the `REPO_OWNER` because the ghcr lowerercase required
- Add `Free up disk space` step to avoid `No space left on device` like
https://github.com/vllm-project/vllm-ascend/issues/27
- Setup qemu with image to resolve
https://github.com/docker/setup-qemu-action/issues/198
### Does this PR introduce _any_ user-facing change?
NO
### How was this patch tested?
build: CI passed [push
false](https://github.com/vllm-project/vllm-ascend/actions/runs/13347017608/job/37278724158?pr=64)
Note for test case:
1. merge commits ot `main`, `v0.7.1-dev` branch
✅ main: https://github.com/Yikun/vllm-ascend/actions/runs/13347238961
--> ghcr.io/yikun/vllm-ascend:main OK
✅v0.7.1-dev:
https://github.com/Yikun/vllm-ascend/actions/runs/13347229912 -->
ghcr.io/yikun/vllm-ascend:v0.7.1-dev OK
2. create pep440 tag from github release: v0.7.1rc1, v0.7.1,
v0.7.1rc1.dev1 all release has latest
✅ v0.7.5 --> v0.7.5, latest
✅ v0.7.5rc1 --> v0.7.5rc1
✅ v0.7.5rc1.dev1 --> v0.7.5rc1.dev1
(no latest, add a todo here) v0.7.5rc1.post1 --> v0.7.5rc1.post1
3. create unknow tag from github release:
✅ create 0.7.1 on v0.7.1-dev: not trigger ( only prefix v triggerd)
4. create tag from git:
✅ also works, `git tag v0.7.99;git push origin v0.7.99` from
publish-image
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
### What this PR does / why we need it?
vLLM Ascend plugin (vllm-ascend) is a backend plugin for running vLLM on
the Ascend NPU.
This plugin is the recommended approach for supporting the Ascend
backend within the vLLM community. It adheres to the principles outlined
in the [RFC]: Hardware pluggable, providing a hardware-pluggable
interface that decouples the integration of the Ascend NPU with vLLM.
This patch also include changes to make CI work and use cache speed up
e2e test, including:
1. Change push (post merge ci) and pull_request (pr ci) trigger branch
to main
2. Make mypy work by ignore base_communicator and clear unused deps
3. Several improvements for vllm_ascend_test:
- use cache (pip, ms, hf) speed up e2e test (25mins --> 5mins)
- switch `git clone` command to `action/checkout` to speedup checkout
and
- Enable sv for pytest for better info dump
- Remove network host to resole `docker: conflicting ontions: cannot
attach both user-defined and non-user-definednetwork-modes`, which is a
problem on docker 1.45 but not on 1.39.
4. Adapt MLA decode optimizations:
cabaf4eff3
### Does this PR introduce _any_ user-facing change?
Yes, init the PR.
### How was this patch tested?
- This is the first PR to make ascend NPU work on vLLM. All code is
tested on ascend with vLLM V0 Engine.
- CI passed
---------
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: MengqingCao <cmq0113@163.com>
Co-authored-by: wangshuai09 <391746016@qq.com>
Co-authored-by: Shanshan Shen <467638484@qq.com>
Co-authored-by: wangli <wangli858794774@gmail.com>