xc-llm-ascend

Author	SHA1	Message	Date
Li Wang	99e1ea0fe6	[v0.18.0][Misc] Upgrade torch_npu to pre-release built version (#7918 ) ### What this PR does / why we need it? This PR upgrades the `torch_npu` (PTA) version in multiple Dockerfiles to a pre-release build. It introduces logic to dynamically select the correct wheel based on the Python version and system architecture. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? CI passed with existing tests. The author should verify that the Docker images build successfully for all supported architectures and Python versions. --------- Signed-off-by: wangli <wangli858794774@gmail.com>	2026-04-01 22:41:09 +08:00
zhangxinyuehfad	af4278be35	[v0.18.0][CI] Close build image by pr (#7776 ) ### What this PR does / why we need it? Disable the per-PR image build. This PR is related to https://github.com/vllm-project/vllm-ascend/pull/7775; please merge them together. Signed-off-by: hfadzxy <starmoon_zhang@163.com>	2026-03-31 16:38:43 +08:00
zhangxinyuehfad	124bb00158	[CI][v0.18.0] Build nightly image for releases/v0.18.0 per pr (#7662 ) ### What this PR does / why we need it? This patch adds a per-PR image build for the `releases/v0.18.0` branch. Due to the limitations of the quay naming convention, the image tag cannot be the same as the branch name, so the daily build uses the image tag `releases-v0.18.0`. Signed-off-by: hfadzxy <starmoon_zhang@163.com>	2026-03-26 16:48:51 +08:00
Mengqing Cao	e20f0b1a0d	[ReleaseNote] Add release note for v0.17.0rc1 (#7240 ) ### What this PR does / why we need it? This pull request adds the release notes for `v0.17.0rc1`. It also updates version numbers across various documentation files, including `README.md`, `README.zh.md`, `docs/source/community/versioning_policy.md`, and `docs/source/conf.py` to reflect the new release. - vLLM version: v0.17.0 - vLLM main: `4034c3d32e`	2026-03-15 22:47:47 +08:00
Mengqing Cao	1a83c8e2f5	[CI] Build Image for v0.16.0rc1 (#7155 ) ### What this PR does / why we need it? Build Image for v0.16.0rc1 - vLLM version: v0.16.0 - vLLM main: `4034c3d32e` Signed-off-by: MengqingCao <cmq0113@163.com> Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>	2026-03-11 14:48:50 +08:00
SILONG ZENG	95b44d7b73	[bugfix]fix file not found error in nightly of single-node (#6976 ) ### What this PR does / why we need it? 1. The main image build takes approximately two hours. The main image build time needs to be moved forward to 21:00 (UTC+8) to ensure that the nightly image build can use the latest main image. ``` bash schedule: # UTC+8: 8am, 12pm, 16pm, 22pm - cron: '0 0,4,8,14 * * *' ``` ---> ``` bash schedule: # UTC+8: 8am, 12pm, 16pm, 21pm - cron: '0 0,4,8,13 * * *' ``` Link: https://github.com/vllm-project/vllm-ascend/actions/runs/22632712302/job/65641055135#step:8:26 2. The nightly test is encountering the following error: ``` bash ImportError: ascend_transport.so: cannot open shared object file: No such file or directory. ``` The library path needs to be added: ``` bash echo "export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib" >> ~/.bashrc ``` Link: https://github.com/vllm-project/vllm-ascend/actions/runs/22632712302/job/65641054911#step:7:529 ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.16.0 - vLLM main: `15d76f74e2` --------- Signed-off-by: MrZ20 <2609716663@qq.com>	2026-03-04 11:47:26 +08:00
wjunLu	c324053b44	[CI] Revert speedup image building and CI Installation related PRs (#6891 ) ### What this PR does / why we need it? Revert speedup image building and CI Installation related PRs git revert `8835236181` git revert `64fba51275` git revert `263c2f8e8d` git revert `84b00695f8` ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.16.0 - vLLM main: `15d76f74e2` --------- Signed-off-by: wjunLu <wjunlu217@gmail.com>	2026-03-02 08:53:10 +08:00
wjunLu	84b00695f8	[CI] Refactor to speedup image building and CI Installation (#6708 ) ### What this PR does / why we need it? 1. Refactor the image workflow to use cache-from to speed up builds ![build](https://github.com/user-attachments/assets/02135c12-0069-44f8-a3ec-5c2b4282448a) All Dockerfiles are refactored at the same time, placing layers that rarely change before those that change frequently to improve the build cache hit rate. 2. Refactor the E2E tests to use vllm-ascend container images, skipping C compilation when no C code has changed ![e2e](https://github.com/user-attachments/assets/49f5b166-0df3-41e1-8f71-b3bbbed17cfd) In this case, the job only replaces the source code of vllm-ascend and installs `requirements-dev.txt`, saving about 10 minutes before the tests start ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.15.0 - vLLM main: `9562912cea` Signed-off-by: wjunLu <wjunlu217@gmail.com>	2026-02-28 09:06:00 +08:00
wangxiyuan	d0bc16859c	[CI][Misc] Some improvement for github action (#6587 ) ### What this PR does / why we need it? - This PR removes several self-hosted runner labels from the `actionlint.yaml` configuration file. These runners are likely no longer in use, so this change cleans up the configuration and ensures `actionlint` has an accurate list of available runners. - Move all Action dockerfiles to one folder - remove useless `runner` input for e2e test. - update workflow option version ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? This is a configuration change for the CI linter. The correctness will be verified by `actionlint` running in CI on subsequent pull requests. - vLLM version: v0.15.0 - vLLM main: `d7e17aaacd` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2026-02-06 14:06:27 +08:00
Li Wang	d018aeb5fa	[Image] Bump mooncake version to v0.3.8.post1 (#6428 ) ### What this PR does / why we need it? This patch bumps the mooncake version to the latest [release](https://github.com/kvcache-ai/Mooncake/releases/tag/v0.3.8.post1) ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? Tested locally: `>>> from mooncake.engine import TransferEngine` - vLLM version: v0.14.1 - vLLM main: `dc917cceb8` --------- Signed-off-by: wangli <wangli858794774@gmail.com>	2026-02-06 10:54:03 +08:00
wangxiyuan	f7dc7d9b86	[CI] support build wheel and docker image by workflow (#6453 ) Make the image and wheel build CI jobs work via workflow_dispatch - vLLM version: v0.14.1 - vLLM main: `dc917cceb8` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2026-02-01 20:06:22 +08:00
wangxiyuan	f4abd9b7b5	[CI] Fix 310p image build (#6259 ) Fix 310p docker image build error - vLLM version: v0.14.1 - vLLM main: `d68209402d` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2026-01-26 14:11:56 +08:00
Shaoxu Cheng	fbae41697e	[310P]: refactoring for 310p kvcache and some ops class (#6117 ) ### What this PR does / why we need it? * Refactor the LayerNorm and activation operator classes to decouple the 310P device implementation from the main branch. * Refactor `mm_encoder_attention` on 310P to use the `torch_npu._npu_flash_attention_unpad` operator. * Refactor the QKV inputs in the prefill stage of `attention_v1` on 310P so they are no longer padded to 16× alignment. * Refactor `model_runner` on 310P to align the KV-cache initialization logic with the mainline implementation. ### Does this PR introduce _any_ user-facing change? NO ### How was this patch tested? use the e2e tests. - vLLM version: v0.13.0 - vLLM main: `d68209402d` --------- Signed-off-by: Tflowers-0129 <2906339855@qq.com>	2026-01-24 20:34:29 +08:00
wangxiyuan	d36ca88cf4	[CI] Avoid lint and ut for PR push (#5762 ) 1. Don't run lint and ut again once the PR is merged, to save CI resources 2. Update codecov every 4 hours 3. Rename `model_downloader` to a more suitable name 4. Update the schedule job to a better time. - vLLM version: v0.13.0 - vLLM main: `2f4e6548ef` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2026-01-09 15:57:06 +08:00
wangxiyuan	1ff1c96d13	[CI] Remove workflow_dispatch way for image build (#5742 ) There are some problems with the workflow_dispatch trigger for the image build. Let's remove it for now to make CI happy. I'll add it back once it's well tested. - vLLM version: v0.13.0 - vLLM main: `2f4e6548ef` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2026-01-09 09:20:30 +08:00
wangxiyuan	d03cc9c456	[CI] Fix image build workflow_dispatch error (#5717 ) type `raw` must contain `value` section. This PR fixes the image build error - vLLM version: v0.13.0 - vLLM main: `2f4e6548ef` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2026-01-08 15:07:33 +08:00
wangxiyuan	264cc254cc	[CI] fix image build tag (#5703 ) ref doesn't work with workflow_dispatch, so let's change it to the raw type. This PR also merges the pr_create job into one runner to save resources. - vLLM version: v0.13.0 - vLLM main: `2f4e6548ef` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2026-01-08 09:27:45 +08:00
wangxiyuan	91790fd85a	[CI] move image and wheel job to schedule way (#5685 ) Move the image and wheel jobs to a scheduled trigger to save CI resources - vLLM version: v0.13.0 - vLLM main: `2f4e6548ef` Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2026-01-07 16:40:19 +08:00

18 Commits
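
Note on the schedule change in `95b44d7b73`: GitHub Actions evaluates `cron` schedules in UTC, so the UTC+8 times in that commit's comments have to be shifted by eight hours to get the hour field. The snippet below is only a minimal sketch of that arithmetic. The helper names `local_hours_to_utc` and `cron_expression` are hypothetical and not part of the repository, and the five-field cron layout (minute, hour, day of month, month, day of week) is assumed.

```python
def local_hours_to_utc(hours, utc_offset=8):
    """Shift local wall-clock hours (0-23) to UTC hours, wrapping at midnight."""
    return sorted((h - utc_offset) % 24 for h in hours)


def cron_expression(hours_utc, minute=0):
    """Build a five-field cron string that fires daily at the given UTC hours."""
    hour_field = ",".join(str(h) for h in hours_utc)
    return f"{minute} {hour_field} * * *"


# Hypothetical helper usage: the old and new UTC+8 schedules from the commit.
old = cron_expression(local_hours_to_utc([8, 12, 16, 22]))
new = cron_expression(local_hours_to_utc([8, 12, 16, 21]))
print(old)  # 0 0,4,8,14 * * *
print(new)  # 0 0,4,8,13 * * *
```

Moving the last slot from 22:00 to 21:00 (UTC+8) therefore changes only the final hour entry, from 14 to 13.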