42 Commits

Author SHA1 Message Date
zhangxinyuehfad
75de3fa172 [v0.11.0][Doc] Update doc (#3852)
### What this PR does / why we need it?
Update doc


Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-10-29 11:32:12 +08:00
leo-pony
291c00a224 [Doc] pin version that can stable running 310I Duo to vllm-ascend v0.10.0rc1 (#3455)
Pin version that can stable running 310I Duo to vllm-ascend v0.10.0rc1.

### What this PR does / why we need it?
Since PR #2614 310I Duo been broken. Although we are currently working
on fixing the issue with the 310I Duo being broken, there is no
confirmed timeline for a fix in the short term. To allow users to
quickly find a working version instead of going back and forth on trial
and error, this PR fixes the version in the 310I Duo guide.

### Does this PR introduce _any_ user-facing change?
NA

### How was this patch tested?
NA

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: leo-pony <nengjunma@outlook.com>
2025-10-16 08:54:09 +08:00
wangxiyuan
00ba071022 [Doc] Release note for v0.11.0rc0 (#3224)
### What this PR does / why we need it?
Add release note for v0.11.0rc0

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?


- vLLM version: v0.11.0rc3
- vLLM main:
https://github.com/vllm-project/vllm/commit/releases/v0.11.0

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-09-30 03:26:18 +08:00
weiguihua2
065486820b [Doc] add faqs:install vllm-ascend will overwrite existing torch-npu (#3245)
### What this PR does / why we need it?
add faqs:install vllm-ascend will overwrite existing torch-npu

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.10.2
- vLLM main:
https://github.com/vllm-project/vllm/commit/releases/v0.11.0

Signed-off-by: weiguihua2 <weiguihua2@huawei.com>
2025-09-29 12:02:23 +08:00
lilinsiman
1705501ae2 [BugFix] Fix ACLgraph bug in Qwen3_32b_int8 case (#3204)
### What this PR does / why we need it?
1. Solved the issue where sizes capture failed for the Qwen3-32b-int8
model when aclgraph, dp1, and tp4 were enabled.
2. Added the exception thrown when sizes capture fails and provided a
solution
3. Add this common problem to the FAQ doc
### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
ut

- vLLM version: v0.10.2
- vLLM main:
https://github.com/vllm-project/vllm/commit/releases/v0.11.0

Signed-off-by: lilinsiman <lilinsiman@gmail.com>
2025-09-28 17:44:04 +08:00
wangxiyuan
048bfd5553 [Release] Add release note for v0.10.2rc1 (#2921)
Add release note for v0.10.2rc1

- vLLM version: v0.10.2
- vLLM main:
b834b4cbf1

---------

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-09-16 01:20:05 +08:00
Mengqing Cao
7e16b4a7cd [ReleaseNote] Add Release Note for v0.10.1rc1 (#2635)
Add Release Note for v0.10.1rc1

- vLLM version: v0.10.1.1
- vLLM main:
b5ee1e3261

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
2025-09-04 11:26:47 +08:00
wangxiyuan
41b028aa5f [Doc] add v0.9.1 release note (#2646)
Add release note for 0.9.1

- vLLM version: v0.10.1.1
- vLLM main:
8bd5844989

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-09-03 18:04:27 +08:00
Shanshan Shen
98c68220c1 [Doc] Update v0.9.1rc3 doc (#2512)
### What this PR does / why we need it?
Update `v0.9.1rc3` doc, which are supplements to
https://github.com/vllm-project/vllm-ascend/pull/2488.

- vLLM version: v0.10.0
- vLLM main:
170e8ea9ea

Signed-off-by: Shanshan Shen <87969357+shen-shanshan@users.noreply.github.com>
2025-08-25 11:39:29 +08:00
jack
8bfd16a145 [Doc] Add container image save/load FAQ for offline environments (#2347)
### What this PR does / why we need it?

Add Docker export/import guide for air-gapped environments

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

NA

- vLLM version: v0.10.0
- vLLM main:
d16aa3dae4

Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
2025-08-13 16:00:43 +08:00
Mengqing Cao
49ec6c98b7 [Doc] Update faq (#2334)
### What this PR does / why we need it?
  - update determinitic calculation
  - update support device

### Does this PR introduce _any_ user-facing change?
- Users should update ray and protobuf when using ray as distributed
backend
- Users should change to use `export HCCL_DETERMINISTIC=true` when
enabling determinitic calculation

### How was this patch tested?
N/A

- vLLM version: v0.10.0
- vLLM main:
ea1292ad3e

Signed-off-by: MengqingCao <cmq0113@163.com>
2025-08-12 14:12:53 +08:00
Mengqing Cao
4604882a3e [ReleaseNote] Release note of v0.10.0rc1 (#2225)
### What this PR does / why we need it?
Release note of v0.10.0rc1

- vLLM version: v0.10.0
- vLLM main:
8e8e0b6af1

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
2025-08-07 14:46:49 +08:00
Yikun Jiang
54ace9e12b Add release note for v0.9.1rc2 (#2188)
### What this PR does / why we need it?
Add release note for v0.9.1rc2

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed

- vLLM version: v0.10.0
- vLLM main:
c494f96fbc

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
2025-08-06 09:04:46 +08:00
JohnJan
54f2b31184 [Doc] Add a doc for qwen omni (#1867)
Signed-off-by: wuzhongjian <wuzhongjian_yewu@cmss.chinamobile.com>

### What this PR does / why we need it?
Add FAQ note for qwen omni
Fixes https://github.com/vllm-project/vllm-ascend/issues/1760 issue1



- vLLM version: v0.9.2
- vLLM main:
b9a21e9173
2025-07-20 09:05:41 +08:00
wangxiyuan
9c560b009a [Release] Add 0.9.2rc1 release note (#1725)
Add release note for 0.9.2rc1, we'll release soon









- vLLM version: v0.9.2
- vLLM main:
7bd4c37ae7

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-07-11 17:36:05 +08:00
Mengqing Cao
c1c5d56255 [Doc] Update FAQ and add test guidance (#1360)
### What this PR does / why we need it?
- Add test guidance
- Add reduce layer guidance
- update faq on determinitic calculation

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
2025-06-25 09:59:23 +08:00
weiguihua2
e1123172d1 [Doc] Add reinstall instructions doc (#1303)
Add a new FAQ, if users re-install vllm-ascend with pip, the `build`
folder should be removed first

---------

Signed-off-by: rjg-lyh <1318825571@qq.com>
Signed-off-by: weiguihua <weiguihua2@huawei.com>
Signed-off-by: weiguihua2 <weiguihua2@huawei.com>
2025-06-23 14:06:27 +08:00
xleoken
4447e53d7a [Doc] Change not to no in faqs.md (#1357)
### What this PR does / why we need it?

Change not to no in faqs.md.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?

Local Test

Signed-off-by: xleoken <xleoken@163.com>
2025-06-23 09:01:00 +08:00
Yikun Jiang
c30ddb8331 Bump v0.9.1rc1 release (#1349)
### What this PR does / why we need it?
Bump v0.9.1rc1 release

Closes: https://github.com/vllm-project/vllm-ascend/pull/1341
Closes: https://github.com/vllm-project/vllm-ascend/pull/1334

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed


---------

Signed-off-by: Shanshan Shen <87969357+shen-shanshan@users.noreply.github.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Signed-off-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: shen-shanshan <467638484@qq.com>
2025-06-22 13:15:36 +08:00
Mengqing Cao
8dd686dfa2 [MLA][Graph] Improve assertion on Graph mode with MLA (#933)
### What this PR does / why we need it?
Improve assertion on Graph mode with MLA.

When running deepseek with graph mode, the fused MLA op only support
`numHeads / numKvHeads ∈ {32, 64, 128}`, thus we improve the assertion
info here to avoid users confused with this.

### Does this PR introduce _any_ user-facing change?
Adjusting tp size is required when running deepseek-v3/r1 with graph
mode. deepseek-v2-lite is not supported in graph mode.

### How was this patch tested?
Test locally as the CI machine could not run V3 due to the HBM limits.

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
2025-06-10 22:26:53 +08:00
wangxiyuan
b75cb788dd [Bugfix] add compilation/__init__.py to fix import error (#1152)
1. Add `__init__.py` for vllm_ascend/compilation to make sure it's a
python module
2. Fix model runner bug to keep the same with vllm
3. Add release note for 0.9.0rc2

---------

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-06-10 17:14:25 +08:00
wangxiyuan
5ac4872f5e [Doc] Add 0.9.0rc1 release note (#1106)
Add the release note for v0.9.0rc1

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-06-09 19:39:21 +08:00
wangxiyuan
5903547d09 [doc] add 0.7.3.post1 release note (#1008)
Add release note for 0.7.3.post1
Add the missing release note back for 0.7.3

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-05-29 17:38:34 +08:00
wangxiyuan
6193ba679b [CI] add codespell CI and fix format.sh (#827)
1. Fix format check error to make format.sh work
2. Add codespell check CI 
3. Add the missing required package for vllm-ascend.

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-05-12 22:04:48 +08:00
zzzzwwjj
5301649108 [Doc] Add notes for OOM in FAQs (#786)
### What this PR does / why we need it?
add notes for OOM in faqs.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed

---------

Signed-off-by: zzzzwwjj <1183291235@qq.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
2025-05-08 16:28:29 +08:00
Yikun Jiang
ec27af346a [Doc] Add 0.8.5rc1 release note (#756)
### What this PR does / why we need it?
Add 0.8.5rc1 release note and bump vllm version to v0.8.5.post1

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

CI passed

---------

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
2025-05-06 23:46:35 +08:00
wangxiyuan
0dae55a9a3 [MISC] fix format check error (#654)
This pr makes format.sh works as expect.

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-04-29 11:14:19 +08:00
wangxiyuan
5995d23532 [Doc] Add 0.8.4rc2 release note (#705)
Add 0.8.4rc2 release note

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-04-28 21:51:35 +08:00
Li Wang
58f9d932d3 [Doc] Update faqs (#699)
### What this PR does / why we need it?
Update faqs to make it more clear


Signed-off-by: wangli <wangli858794774@gmail.com>
2025-04-28 18:48:23 +08:00
wangxiyuan
5de3646522 [MISC] Make vllm version configurable (#651)
Sometimes, user install a dev/editable version of vllm. In this case, we
should make sure vllm-ascend works as well.

This PR add a new env `VLLM_VERSION`. It's used for developers who edit
vllm. In this case, developers should set thie env to make sure which
vllm version is installed and used.

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-04-28 14:19:06 +08:00
Mengqing Cao
b91f9a5afd [Doc][Build] Update build doc and faq (#568)
Update build doc and faq about deepseek w8a8

Signed-off-by: MengqingCao <cmq0113@163.com>
2025-04-18 14:16:41 +08:00
wangxiyuan
e66ded5679 [Doc] Add release note for 0.8.4rc1 (#557)
Add release note for 0.8.4rc1, we'll release 0.8.4rc1 now.

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-04-18 13:24:36 +08:00
Shanshan Shen
7eeff60715 [Doc] Update FAQ doc (#561)
### What this PR does / why we need it?
Update FAQ doc to make `docker pull` more clear


Signed-off-by: shen-shanshan <467638484@qq.com>
2025-04-18 13:13:13 +08:00
Li Wang
64fdf4cbef [Doc]Update faq (#536)
### What this PR does / why we need it?
update performance and accuracy faq

Signed-off-by: wangli <wangli858794774@gmail.com>
2025-04-17 14:56:51 +08:00
hfadzxy
00de2ee6ad [Doc] update faq about progress bar display issue (#538)
### What this PR does / why we need it?
update faq about progress bar display issue

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
2025-04-16 16:07:08 +08:00
Mengqing Cao
fe13cd9ea5 [Doc] update faq about w8a8 (#534)
update faq about w8a8

---------

Signed-off-by: Mengqing Cao <cmq0113@163.com>
2025-04-16 09:37:21 +08:00
wangxiyuan
bbe7ccd366 [MISC] Add patch module (#526)
This PR added patch module for vllm
1. platform patch: the patch will be registered when load the platform
2. worker patch: the patch will be registered when worker is started.

The detail is:
1. patch_common: patch for main and 0.8.4 version
4. patch_main: patch for main verison
5. patch_0_8_4: patch for 0.8.4 version
2025-04-16 09:28:58 +08:00
wangxiyuan
5c6d79687c [Doc] Update FAQ (#518)
Update FAQ

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-04-15 10:17:56 +08:00
Shanshan Shen
11ecbfdb31 [Doc] Update FAQ doc (#504)
### What this PR does / why we need it?
Update FAQ doc.
---------

Signed-off-by: shen-shanshan <467638484@qq.com>
2025-04-14 11:11:40 +08:00
wangxiyuan
ca8b1c3e47 [Doc] Add 0.7.3rc2 release note (#419)
Add 0.7.3rc2 release note. We'll release 0.7.3rc2 right now.

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-03-29 09:02:08 +08:00
wangxiyuan
c25631ec7b [Doc] Add the release note for 0.7.3rc1 (#285)
Add the release note for 0.7.3rc1

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-03-13 17:57:06 +08:00
Yikun Jiang
cff08f9df8 [Doc] Add initial FAQs (#247)
### What this PR does / why we need it?
Add initial FAQs

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Preview

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
2025-03-06 10:42:42 +08:00