Pin version that can stable running 310I Duo to vllm-ascend v0.10.0rc1.
### What this PR does / why we need it?
Since PR #2614 310I Duo been broken. Although we are currently working
on fixing the issue with the 310I Duo being broken, there is no
confirmed timeline for a fix in the short term. To allow users to
quickly find a working version instead of going back and forth on trial
and error, this PR fixes the version in the 310I Duo guide.
### Does this PR introduce _any_ user-facing change?
NA
### How was this patch tested?
NA
- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0
---------
Signed-off-by: leo-pony <nengjunma@outlook.com>
### What this PR does / why we need it?
add faqs:install vllm-ascend will overwrite existing torch-npu
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.10.2
- vLLM main:
https://github.com/vllm-project/vllm/commit/releases/v0.11.0
Signed-off-by: weiguihua2 <weiguihua2@huawei.com>
### What this PR does / why we need it?
1. Solved the issue where sizes capture failed for the Qwen3-32b-int8
model when aclgraph, dp1, and tp4 were enabled.
2. Added the exception thrown when sizes capture fails and provided a
solution
3. Add this common problem to the FAQ doc
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
ut
- vLLM version: v0.10.2
- vLLM main:
https://github.com/vllm-project/vllm/commit/releases/v0.11.0
Signed-off-by: lilinsiman <lilinsiman@gmail.com>
### What this PR does / why we need it?
Add Docker export/import guide for air-gapped environments
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
NA
- vLLM version: v0.10.0
- vLLM main:
d16aa3dae4
Signed-off-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
### What this PR does / why we need it?
- update determinitic calculation
- update support device
### Does this PR introduce _any_ user-facing change?
- Users should update ray and protobuf when using ray as distributed
backend
- Users should change to use `export HCCL_DETERMINISTIC=true` when
enabling determinitic calculation
### How was this patch tested?
N/A
- vLLM version: v0.10.0
- vLLM main:
ea1292ad3e
Signed-off-by: MengqingCao <cmq0113@163.com>
### What this PR does / why we need it?
Release note of v0.10.0rc1
- vLLM version: v0.10.0
- vLLM main:
8e8e0b6af1
---------
Signed-off-by: MengqingCao <cmq0113@163.com>
### What this PR does / why we need it?
Add release note for v0.9.1rc2
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
CI passed
- vLLM version: v0.10.0
- vLLM main:
c494f96fbc
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Add a new FAQ, if users re-install vllm-ascend with pip, the `build`
folder should be removed first
---------
Signed-off-by: rjg-lyh <1318825571@qq.com>
Signed-off-by: weiguihua <weiguihua2@huawei.com>
Signed-off-by: weiguihua2 <weiguihua2@huawei.com>
### What this PR does / why we need it?
Change not to no in faqs.md.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Local Test
Signed-off-by: xleoken <xleoken@163.com>
### What this PR does / why we need it?
Improve assertion on Graph mode with MLA.
When running deepseek with graph mode, the fused MLA op only support
`numHeads / numKvHeads ∈ {32, 64, 128}`, thus we improve the assertion
info here to avoid users confused with this.
### Does this PR introduce _any_ user-facing change?
Adjusting tp size is required when running deepseek-v3/r1 with graph
mode. deepseek-v2-lite is not supported in graph mode.
### How was this patch tested?
Test locally as the CI machine could not run V3 due to the HBM limits.
---------
Signed-off-by: MengqingCao <cmq0113@163.com>
1. Add `__init__.py` for vllm_ascend/compilation to make sure it's a
python module
2. Fix model runner bug to keep the same with vllm
3. Add release note for 0.9.0rc2
---------
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
1. Fix format check error to make format.sh work
2. Add codespell check CI
3. Add the missing required package for vllm-ascend.
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
### What this PR does / why we need it?
add notes for OOM in faqs.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
CI passed
---------
Signed-off-by: zzzzwwjj <1183291235@qq.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
### What this PR does / why we need it?
Add 0.8.5rc1 release note and bump vllm version to v0.8.5.post1
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
CI passed
---------
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Sometimes, user install a dev/editable version of vllm. In this case, we
should make sure vllm-ascend works as well.
This PR add a new env `VLLM_VERSION`. It's used for developers who edit
vllm. In this case, developers should set thie env to make sure which
vllm version is installed and used.
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
This PR added patch module for vllm
1. platform patch: the patch will be registered when load the platform
2. worker patch: the patch will be registered when worker is started.
The detail is:
1. patch_common: patch for main and 0.8.4 version
4. patch_main: patch for main verison
5. patch_0_8_4: patch for 0.8.4 version
### What this PR does / why we need it?
Add initial FAQs
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Preview
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>