xc-llm-ascend

Author	SHA1	Message	Date
Yikun Jiang	e4e9ea02ab	Upgrade vLLM version to v0.9.2 (#1652 ) ### What this PR does / why we need it? This patch upgrade vLLM version to v0.9.2, this patch didn't remove the v0.9.1 compatible code to easy review. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - vLLM version: v0.9.1 - vLLM main: `14601f5fba` - Accuracy test with 0.9.2: https://github.com/vllm-project/vllm-ascend/actions/runs/16121612087 Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-07-08 14:18:17 +08:00
Mengqing Cao	96fa7ff63b	[DP][V1] Fix rank set in DP scenario & Bump torch-npu version to 2.5.1.post1.dev20250528 (#1235 ) ### What this PR does / why we need it? 1. Fix rank set in DP scenario. The new poc version of torch-npu support setting `ASCEND_RT_VISIBLE_DEVICES` dynamically, thus we could use the rank set in `DPEngineCoreProc` directly instead of calculating local rank across dp by hand in the patched `_init_data_parallel` Closes: https://github.com/vllm-project/vllm-ascend/issues/1170 2. Bump torch-npu version to 2.5.1.post1.dev20250528 Closes: https://github.com/vllm-project/vllm-ascend/pull/1242 Closes: https://github.com/vllm-project/vllm-ascend/issues/1232 ### How was this patch tested? CI passed with new added test. --------- Signed-off-by: MengqingCao <cmq0113@163.com> Signed-off-by: Icey <1790571317@qq.com> Co-authored-by: Icey <1790571317@qq.com>	2025-06-16 23:09:53 +08:00
wangxiyuan	4f5964420e	[CI] Upgrade vllm to 0.9.1 (#1165 ) 1. upgrade vllm to 0.9.1. 0.9.0 is not supported for main branch now. keep doc to 0.9.0 until we release the first 0.9.1 release. 2. disable V0 test for PR 3. move actionlint check to lint job Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-06-11 16:33:11 +08:00
wangxiyuan	f6e5decc10	[CI] upgrade to vllm 0.9.0 (#959 ) Upgrade to vllm 0.9.0. 0.8.5 will not be supported any more. Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-05-28 21:18:41 +08:00
Yikun Jiang	5897dc5bbe	[Build] Bump vLLM version to v0.8.5.post1 (#755 ) ### What this PR does / why we need it? Bump vllm version to v0.8.5.post1 ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? CI passed Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-05-06 11:44:12 +08:00
Yikun Jiang	79538b5d73	Upgrade CANN version to 8.1.rc1 (#747 ) ### What this PR does / why we need it? Make CANN version bump separately from https://github.com/vllm-project/vllm-ascend/pull/708 - Upgrade CANN version to 8.1.rc1 - Add prefix to speed up download `m.daocloud.io/quay.io/ascend/cann:8.1.rc1-910b-ubuntu22.04-py3.10` - Address tail sapce for Dockerfile.openEuler - Add note for `/workspace` and `/vllm-workspace` as followup of https://github.com/vllm-project/vllm-ascend/pull/741 ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? CI passed Co-authored-by: MengqingCao <cmq0113@163.com> Signed-off-by: Yikun Jiang <yikunkero@gmail.com> Co-authored-by: MengqingCao <cmq0113@163.com>	2025-05-06 05:44:18 +08:00
Mengqing Cao	399b03830d	[Build][Bugfix] Fix source code path to avoid reference error (#726 ) ### What this PR does / why we need it? Fix source code path to avoid reference error in docker image fix https://github.com/vllm-project/vllm-ascend/issues/725 Signed-off-by: MengqingCao <cmq0113@163.com>	2025-04-30 17:38:13 +08:00
wangxiyuan	f8350569e6	[CI] upgrade vllm to 0.8.5 (#715 ) 1. Upgrade vllm to 0.8.5 2. Drop 0.8.4 support 3. Keep doc to 0.8.4rc2 until we release 0.8.5 Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-04-30 09:15:50 +08:00
Yikun Jiang	2e20797934	[BUILD] Upgrade torch-npu to 2.5.1 (#661 ) ### What this PR does / why we need it? The torch-npu 2.5.1 are published: https://pypi.org/project/torch-npu/2.5.1/ It's time to remove all torch-npu dev version from vllm-ascend code base ### Does this PR introduce _any_ user-facing change? Yes, using torch-npu 2.5.1 ### How was this patch tested? - [ ] CI passed - [ ] Manually test - [ ] Grep all `dev2025` --------- Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-04-27 17:28:29 +08:00
Yikun Jiang	086423dc35	[Docker] Bump Dockerfile version to v0.8.4 (#577 ) ### What this PR does / why we need it? Bump Dockerfile version to v0.8.4 ### Does this PR introduce _any_ user-facing change? docker image are using v0.8.4 version vLLM ### How was this patch tested? CI passed Closes: https://github.com/vllm-project/vllm-ascend/pull/571 Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-04-18 19:15:17 +08:00
hfadzxy	9935d45728	[CI]Add model basic accuracy test(Qwen2.5-0.5B-Instruct) (#460 ) ### What this PR does / why we need it? Add model basic accuracy test(Qwen2.5-0.5B-Instruct) Signed-off-by: hfadzxy <starmoon_zhang@163.com>	2025-04-17 14:59:56 +08:00
wangxiyuan	9c7428b3d5	[CI] enable custom ops build (#466 ) ### What this PR does / why we need it? This PR enable custom ops build by default. ### Does this PR introduce _any_ user-facing change? Yes, users now install vllm-ascend from source will trigger custom ops build step. ### How was this patch tested? By image build and e2e CI --------- Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-04-12 10:24:53 +08:00
wangxiyuan	ac1ba1d8d2	[Build] Fix x86 image build (#327 ) Install cpu version of pytorch in x86 to reduce image size Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-03-14 09:41:57 +08:00
wangxiyuan	9450e9811b	[CI] Uninstall triton in dockerfile (#298 ) triton doesn't work with ascend. We should make sure it's uninstalled in dockerfile Related: https://github.com/vllm-project/vllm-ascend/issues/291 --------- Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com> Signed-off-by: Yikun Jiang <yikunkero@gmail.com> Co-authored-by: Yikun Jiang <yikunkero@gmail.com>	2025-03-12 07:14:57 +08:00
Yikun Jiang	268da28961	Pin modelscope<1.23.0 on vLLM v0.7.3 (#272 ) ### What this PR does / why we need it? Pin modelscope<1.23.0 on vLLM v0.7.3 to resolve: https://github.com/vllm-project/vllm/pull/13807 ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? CI passed Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-03-09 15:59:42 +08:00
Yikun Jiang	839dac8d60	Install wget to fix image build (#231 ) ### What this PR does / why we need it? Install `wget` to fix image build ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? CI passed --------- Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-03-04 09:01:23 +08:00
Yikun Jiang	ebe14f20cf	Recover vllm-ascend dev image (#209 ) ### What this PR does / why we need it? Recover vllm-ascend dev image ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? CI passed Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-03-03 09:08:41 +08:00
Yikun Jiang	46740958f2	Add ray to docker image (#197 ) ### What this PR does / why we need it? Add ray to docker image to make `ray` work ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? CI passed Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-02-28 15:23:18 +08:00
wangxiyuan	cff03a4913	[CI] change to quay.io (#102 ) change docker registry to quay Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-02-19 17:04:46 +08:00
wangxiyuan	fafd70e91c	[Doc] Update doc to work with release (#85 ) 1. Update CANN image name 2. Add pta install step 3. update vllm-ascend docker image name to ghcr 4. update quick_start to use vllm-ascend image directly. 5. fix `note` style Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-02-19 09:51:43 +08:00
Yikun Jiang	bfbfbce184	[CI] Add container image build ci (#64 ) ### What this PR does / why we need it? Add container image build ci: - Enable branch, tag docker image publish - branch image: `vllm-ascend:main`, `vllm-ascend:v0.7.1-dev` - tag image: `vllm-ascend:v0.7.1rc1` - Enable PR docker image build check - other changes: - Prepare the `REPO_OWNER` because the ghcr lowerercase required - Add `Free up disk space` step to avoid `No space left on device` like https://github.com/vllm-project/vllm-ascend/issues/27 - Setup qemu with image to resolve https://github.com/docker/setup-qemu-action/issues/198 ### Does this PR introduce _any_ user-facing change? NO ### How was this patch tested? build: CI passed [push false](https://github.com/vllm-project/vllm-ascend/actions/runs/13347017608/job/37278724158?pr=64) Note for test case: 1. merge commits ot `main`, `v0.7.1-dev` branch ✅ main: https://github.com/Yikun/vllm-ascend/actions/runs/13347238961 --> ghcr.io/yikun/vllm-ascend:main OK ✅v0.7.1-dev: https://github.com/Yikun/vllm-ascend/actions/runs/13347229912 --> ghcr.io/yikun/vllm-ascend:v0.7.1-dev OK 2. create pep440 tag from github release: v0.7.1rc1, v0.7.1, v0.7.1rc1.dev1 all release has latest ✅ v0.7.5 --> v0.7.5, latest ✅ v0.7.5rc1 --> v0.7.5rc1 ✅ v0.7.5rc1.dev1 --> v0.7.5rc1.dev1 (no latest, add a todo here) v0.7.5rc1.post1 --> v0.7.5rc1.post1 3. create unknow tag from github release: ✅ create 0.7.1 on v0.7.1-dev: not trigger ( only prefix v triggerd) 4. create tag from git: ✅ also works, `git tag v0.7.99;git push origin v0.7.99` from publish-image Signed-off-by: Yikun Jiang <yikunkero@gmail.com>	2025-02-17 09:07:35 +08:00
Yikun Jiang	d5e7756028	[Core] Init vllm-ascend (#3 ) ### What this PR does / why we need it? vLLM Ascend plugin (vllm-ascend) is a backend plugin for running vLLM on the Ascend NPU. This plugin is the recommended approach for supporting the Ascend backend within the vLLM community. It adheres to the principles outlined in the [RFC]: Hardware pluggable, providing a hardware-pluggable interface that decouples the integration of the Ascend NPU with vLLM. This patch also include changes to make CI work and use cache speed up e2e test, including: 1. Change push (post merge ci) and pull_request (pr ci) trigger branch to main 2. Make mypy work by ignore base_communicator and clear unused deps 3. Several improvements for vllm_ascend_test: - use cache (pip, ms, hf) speed up e2e test (25mins --> 5mins) - switch `git clone` command to `action/checkout` to speedup checkout and - Enable sv for pytest for better info dump - Remove network host to resole `docker: conflicting ontions: cannot attach both user-defined and non-user-definednetwork-modes`, which is a problem on docker 1.45 but not on 1.39. 4. Adapt MLA decode optimizations: `cabaf4eff3` ### Does this PR introduce _any_ user-facing change? Yes, init the PR. ### How was this patch tested? - This is the first PR to make ascend NPU work on vLLM. All code is tested on ascend with vLLM V0 Engine. - CI passed --------- Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com> Signed-off-by: Yikun Jiang <yikunkero@gmail.com> Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com> Co-authored-by: MengqingCao <cmq0113@163.com> Co-authored-by: wangshuai09 <391746016@qq.com> Co-authored-by: Shanshan Shen <467638484@qq.com> Co-authored-by: wangli <wangli858794774@gmail.com>	2025-02-05 10:53:12 +08:00

22 Commits