25 Commits

Author SHA1 Message Date
zhangxinyuehfad
ca0823f238 [0.11.0][Bugfix] fix fastapi version (#5052)
### What this PR does / why we need it?
fix fastapi version

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-12-16 11:34:11 +08:00
zhangxinyuehfad
a686f2962a [0.11.0][Bugfix] fix e2e full test (#4424)
### What this PR does / why we need it?
pin Transformer version to 4.57.1 fix 'dict' object has no attribute
'model_type'

https://github.com/vllm-project/vllm-ascend/actions/runs/19660859460/job/56306822464

picked from https://github.com/vllm-project/vllm-ascend/pull/4423


Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-11-25 21:21:42 +08:00
wangxiyuan
8a7154001e [0.11.0]Chery pick pta upgrade change (#3940)
This PR cherry-pick two commit from main to upgrade torch-npu to 2.7.1
official release

---------

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-10-31 22:14:26 +08:00
Mengqing Cao
1b16c01afd [v0.11.0-dev][Installation] limit opencv-python-headless version to resolve numpy version conflict (#3767)
### What this PR does / why we need it?
vllm requires opencv-python-headless >= 4.11.0 which requires
(numpy<2.3.0,>=2), but vllm-ascend numpy version must be less than
2.0.0, so limit opencv-python-headless less than 4.11.0.86 will fix this
conflict.

backport of
afc58184ec

Signed-off-by: 22dimensions <waitingwind@foxmail.com>
Co-authored-by: 22dimensions <waitingwind@foxmail.com>
2025-10-25 18:18:28 +08:00
wangxiyuan
ba19dd3183 Revert PTA upgrade PR (#3352)
we notice that torch npu 0919 doesn't work. This PR revert related
change which rely on 0919 version.
Revert PR: #3295  #3205  #3102 

Related: #3353

- vLLM version: v0.11.0
2025-10-10 14:09:53 +08:00
wangxiyuan
4abdcdba4e upgrade pta to 0919 (#3295)
### What this PR does / why we need it?
Upgrade torch-npu to the newest POC version
### Does this PR introduce _any_ user-facing change?
yes, user need upgrade the pta version as well.
### How was this patch tested?


- vLLM version: v0.11.0rc3
- vLLM main:
https://github.com/vllm-project/vllm/commit/releases/v0.11.0

---------

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-09-30 17:14:23 +08:00
wangxiyuan
9260910c8d [CI] Fix broken CI (#2302)
1. disable test_eagle_ccorrectness test, we'll reopen it once oom error
fixed.
2. drop transformers version limit for main, since vLLM rely on
>=4.55.0, see:
65552b476b
3. fix kv_connector_output bug, see:
796bae07c5

- vLLM version: v0.10.0
- vLLM main:
d1af8b7be9

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-08-11 11:22:32 +08:00
leo-pony
807f0895b2 Bump torch version to 2.7.1 (#1562)
### What this PR does / why we need it?
Bump torch version to 2.7.1, and cleanup infer schema patch
https://github.com/vllm-project/vllm-ascend/commit/857f489
(https://github.com/vllm-project/vllm-ascend/pull/837), this patch
depends on also: https://github.com/vllm-project/vllm-ascend/pull/1974

### Does this PR introduce any user-facing change?
No

#### How was this patch tested?
CI passed

torch-npu 2.7.1rc1 install guide:
https://gitee.com/ascend/pytorch/tree/v2.7.1/
install depending:
```
pip3 install pyyaml
pip3 install setuptools
```
install torch-npu:

Closes: https://github.com/vllm-project/vllm-ascend/issues/1866
Closes: https://github.com/vllm-project/vllm-ascend/issues/1390


- vLLM version: v0.10.0
- vLLM main:
9af654cc38

---------

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Signed-off-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
2025-08-05 08:43:24 +08:00
Yikun Jiang
17a430f7b8 Upgrade vLLM to v0.10.0 (#1927)
### What this PR does / why we need it?
- Upgrade to v0.10.0
- Drop v0.9.2 version compatibility
- Add patch for
`vllm_ascend/patch/worker/patch_common/patch_sampler_gather_logprobs.py`
as workaround of
f3a683b7c9
for v0.10.0 and also add e2e test `test_models_prompt_logprobs`
- Pin transformers<4.54.0 as workaround of
https://github.com/vllm-project/vllm-ascend/issues/2034

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Test locally:
`VLLM_USE_MODELSCOPE=true pytest -sv
tests/e2e/singlecard/test_offline_inference.py::test_models_prompt_logprobs`
- CI passed

- vLLM version: v0.9.2
- vLLM main:
7728dd77bb

---------

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
2025-07-26 15:43:29 +08:00
Li Wang
bdfb065b5d [1/2/N] Enable pymarkdown and python __init__ for lint system (#2011)
### What this PR does / why we need it?
1. Enable pymarkdown check
2. Enable python `__init__.py` check for vllm and vllm-ascend
3. Make clean code

### How was this patch tested?


- vLLM version: v0.9.2
- vLLM main:
29c6fbe58c

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-07-25 22:16:10 +08:00
wangxiyuan
2b726d8f90 [CI] Fix broken CI (#1889)
1. vLLM commit
45badd05d0
changed the pooling check logic which broken vLLM Ascend.
2. vLLM commit
3e04107d97
requires higher version of transformers. The transformers version bug
has been fixed by
e936e401de.
We can safe to remove the version limit now.
3. vLLM commit
217937221b
added a new input `enable_eplb` for FusedMoe Ops

This PR fix the broken CI.


- vLLM version: v0.9.2
- vLLM main:
6a971ed692

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-07-20 02:11:57 +08:00
Mengqing Cao
2cf9c4c3a2 [CI/Build] Fix version conflict on transformers (#1490)
### What this PR does / why we need it?
Fix version conflict on transformers:
`pip._vendor.pkg_resources.ContextualVersionConflict: (transformers
4.53.0 (/usr/local/python3.10.17/lib/python3.10/site-packages),
Requirement.parse('transformers<4.53.0'), {'vllm-ascend'})`
Fix
https://github.com/vllm-project/vllm-ascend/actions/runs/15933263325/job/44947231642

### Does this PR introduce _any_ user-facing change?
Fix broken build

### How was this patch tested?
CI passed with new existing test.

Signed-off-by: MengqingCao <cmq0113@163.com>
2025-06-28 15:11:04 +08:00
Mengqing Cao
d59e7fa095 [CI] Pin transformers<4.53.0 and fix EPLB load_weights to make CI passed (#1482)
### What this PR does / why we need it?

- Fix vLLM EPLB break
e9fd658a73
by recovering load_weights back to [v0.9.1
version](07b8fae219)
temporarily.

- Fix transformers>=4.53.0 image processor break
Related: https://github.com/vllm-project/vllm-ascend/issues/1470

- Mirror torch_npu requirements to pyproject.toml

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
2025-06-28 00:12:43 +08:00
Mengqing Cao
96fa7ff63b [DP][V1] Fix rank set in DP scenario & Bump torch-npu version to 2.5.1.post1.dev20250528 (#1235)
### What this PR does / why we need it?
1. Fix rank set in DP scenario. The new poc version of torch-npu support
setting `ASCEND_RT_VISIBLE_DEVICES` dynamically, thus we could use the
rank set in `DPEngineCoreProc` directly instead of calculating local
rank across dp by hand in the patched `_init_data_parallel`

Closes: https://github.com/vllm-project/vllm-ascend/issues/1170

2. Bump torch-npu version to 2.5.1.post1.dev20250528

Closes: https://github.com/vllm-project/vllm-ascend/pull/1242
Closes: https://github.com/vllm-project/vllm-ascend/issues/1232


### How was this patch tested?
CI passed with new added test.

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Icey <1790571317@qq.com>
Co-authored-by: Icey <1790571317@qq.com>
2025-06-16 23:09:53 +08:00
Yikun Jiang
4976b48b98 [Build] Move numba/quart to requirments and update DS baseline and sync graph typo fix (#1121)
### What this PR does / why we need it?
1. The dependency was introduced by
https://github.com/vllm-project/vllm-ascend/pull/874
- Move numba/quart from requirements-dev to requirments
- Align pyproject.toml with requirements

2. This patch also fix deepseek accuracy baseline which
https://github.com/vllm-project/vllm-ascend/pull/1118 was not addressed.
According to https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite the
gsm8k is about `41.1`

3. This also sync the vLLM upstream changes:
eaa2e51088

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed
vllm ascend test (basic workflow)
vllm longterm test (spec decode)

Closes: https://github.com/vllm-project/vllm-ascend/issues/1120

---------

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
2025-06-08 22:33:37 +08:00
wangxiyuan
6193ba679b [CI] add codespell CI and fix format.sh (#827)
1. Fix format check error to make format.sh work
2. Add codespell check CI 
3. Add the missing required package for vllm-ascend.

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-05-12 22:04:48 +08:00
Yikun Jiang
2e20797934 [BUILD] Upgrade torch-npu to 2.5.1 (#661)
### What this PR does / why we need it?
The torch-npu 2.5.1 are published:
https://pypi.org/project/torch-npu/2.5.1/
It's time to remove all torch-npu dev version from vllm-ascend code base

### Does this PR introduce _any_ user-facing change?
Yes, using torch-npu 2.5.1

### How was this patch tested?
- [ ] CI passed
- [ ] Manually test
- [ ] Grep all `dev2025`

---------

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
2025-04-27 17:28:29 +08:00
Bug Hunter Yan
05bdcbeae4 support aclgraph (#426)
<!--  Thanks for sending a pull request!

BEFORE SUBMITTING, PLEASE READ
https://docs.vllm.ai/en/latest/contributing/overview.html

-->
### What this PR does / why we need it?
<!--
- Please clarify what changes you are proposing. The purpose of this
section is to outline the changes and how this PR fixes the issue.
If possible, please consider writing useful notes for better and faster
reviews in your PR.

- Please clarify why the changes are needed. For instance, the use case
and bug description.

- Fixes #
-->
This PR supports the access of vllm-acend to the piecewise_graph feature
provided by the v1 engine.

1. register unifiled_ascend_attention_with_output for piecewise_graph to
split graph.
2. support NPUGraph to accelerate kernel launch.

### Does this PR introduce _any_ user-facing change?
<!--
Note that it means *any* user-facing change including all aspects such
as API, interface or other behavior changes.
Documentation-only updates are not considered user-facing changes.
-->
support npugraph to default, Users can disenable the npugraph feature by
configuring enforce_eager.

This has corresponding requirements for the versions of torch_npu and
CANN, and they need to support graph capture.

### How was this patch tested?
<!--
CI passed with new added/existing test.
If it was tested in a way different from regular unit tests, please
clarify how you tested step by step, ideally copy and paste-able, so
that other reviewers can test and check, and descendants can verify in
the future.
If tests were not added, please describe why they were not added and/or
why it was difficult to add.
-->
it turn to default

---------

Signed-off-by: Bug Hunter Yan <yanpq@zju.edu.cn>
Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>
Co-authored-by: Yizhou Liu <liu_yizhou@outlook.com>
2025-04-23 20:56:24 +08:00
wangxiyuan
bbe7ccd366 [MISC] Add patch module (#526)
This PR added patch module for vllm
1. platform patch: the patch will be registered when load the platform
2. worker patch: the patch will be registered when worker is started.

The detail is:
1. patch_common: patch for main and 0.8.4 version
4. patch_main: patch for main verison
5. patch_0_8_4: patch for 0.8.4 version
2025-04-16 09:28:58 +08:00
wangxiyuan
9c7428b3d5 [CI] enable custom ops build (#466)
### What this PR does / why we need it?
This PR enable custom ops build  by default. 

### Does this PR introduce _any_ user-facing change?

Yes, users now install vllm-ascend from source will trigger custom ops
build step.

### How was this patch tested?
By image build and e2e CI

---------

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-04-12 10:24:53 +08:00
Yikun Jiang
579d858a20 Set torchvision<0.21.0 to match torch/torch_npu version (#479)
### What this PR does / why we need it?
Set torchvision<0.21.0 to match torch/torch_npu version to resolve
`RuntimeError: operator torchvision::nms does not exist`.

Closes: https://github.com/vllm-project/vllm-ascend/issues/477

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
2025-04-08 09:15:42 +08:00
Yikun Jiang
adabdeea7f Set numpy < 2.0.0 to resolve numpy VersionConflict (#476)
### What this PR does / why we need it?
vLLM bumps numpy version to 2.x:
8427f70493
, this will cause a
`pip._vendor.pkg_resources.ContextualVersionConflict: (numpy 2.2.4
(/usr/local/python3.10/lib/python3.10/site-packages),
Requirement.parse('numpy==1.26.4'), {'vllm-ascend'})` failure when vllm
ascend install. This PR resolved the issue by:
- Set numpy < 2.0.0 to resolve numpy VersionConflict
- Sync requirements and toml 
- Reorder


### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed

Closes: https://github.com/vllm-project/vllm-ascend/issues/473

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
2025-04-07 16:07:21 +08:00
Pleaplusone
ce8259975e [core] Support custom ascendc kernels in vllm-ascend (#233)
This PR add custom ascendc kernel rotary_embedding support in
vllm-ascend, related CMakeLists and setuptools is also added in this PR.

Related: https://github.com/vllm-project/vllm-ascend/issues/156

---------

Signed-off-by: ganyi <pleaplusone.gy@gmail.com>
2025-04-03 14:52:34 +08:00
Huazhong Ji
c8b57d10b2 [Misc] update the dependency version of torch-npu (#50)
### What this PR does / why we need it?

This PR updates the dependency version of vllm-ascend on torch-npu, so
that the vllm-ascend can be installed in a later version environment
(like to torch-npu 2.6.0rc1),

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI Test

Signed-off-by: ji-huazhong <hzji210@gmail.com>
2025-02-12 15:50:38 +08:00
wangxiyuan
c59375caff [Misc] version control by setuptools_scm (#21)
make package version control by setuptools_scm to keep the same with
vllm

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-02-10 09:36:09 +08:00