### What this PR does / why we need it?
Update DeepSeekOCR2.md for releases/v0.18.0
### Does this PR introduce _any_ user-facing change?
NO
### How was this patch tested?
vLLM version: v0.18.0
vLLM main:
bcf2be9612
---------
Signed-off-by: Wangbei25 <wangbei41@huawie.com>
Signed-off-by: Wangbei25 <wangbei41@huawei.com>
Co-authored-by: Wangbei25 <wangbei41@huawie.com>
### What this PR does / why we need it?
1. This PR cherry-picks the commit that contains the current best performance at
3.5k/1.5k and 128k/1k from main to the 0.18.0 branch.
2. This PR introduces MiniMax-M2.7 day-0 information to users.
3. To complete the previous step, it also renames the MiniMax doc from
MiniMax-M2.5.md to MiniMax-M2.md.
---------
Signed-off-by: limuyuan <limuyuan3@huawei.com>
Co-authored-by: limuyuan <limuyuan3@huawei.com>
### What this PR does / why we need it?
This pull request performs a comprehensive cleanup of the vLLM Ascend
documentation. It fixes numerous typos, grammatical errors, and phrasing
issues across community guidelines, developer documents, hardware
tutorials, and feature guides. Key improvements include correcting
hardware names (e.g., Atlas 300I), fixing broken links, cleaning up code
examples (removing duplicate flags and trailing commas), and improving
the clarity of technical explanations. These changes are necessary to
ensure the documentation is professional, accurate, and easy for users
to follow.
### Does this PR introduce _any_ user-facing change?
No, this PR contains documentation-only updates.
### How was this patch tested?
The changes were manually reviewed for accuracy and grammatical
correctness. No functional code changes were introduced.
---------
Signed-off-by: herizhen <1270637059@qq.com>
Signed-off-by: herizhen <59841270+herizhen@users.noreply.github.com>
### What this PR does / why we need it?
1. Add version notes for GLM5.
2. Add parameter modification for GLM5.
3. Add GLM5 to supported model list.
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.18.0
- vLLM main:
35141a7eed
---------
Signed-off-by: yydyzr <liuyuncong1@huawei.com>
Signed-off-by: Zhu Jiyang <zhujiyang2@huawei.com>
Co-authored-by: Zhu Jiyang <zhujiyang2@huawei.com>
### What this PR does / why we need it?
Fix issues in the GLM4.7 documentation and add some missing
explanations.
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
document test
- vLLM version: v0.17.0
- vLLM main:
8a680463fa
---------
Signed-off-by: zjks98 <zhangjiakang4@huawei.com>
Co-authored-by: zjks98 <zhangjiakang4@huawei.com>
## What this PR does / why we need it?
Fixes the broken URL for chunked-prefill in the supported features
documentation page.
The chunked prefill documentation URL was moved from
`performance/optimization.html` to `configuration/optimization.html` in
upstream vLLM docs. This PR updates the link to point to the correct
location.
**Before**:
https://docs.vllm.ai/en/stable/performance/optimization.html#chunked-prefill
(404)
**After**:
https://docs.vllm.ai/en/stable/configuration/optimization.html#chunked-prefill
(working)
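The link change above amounts to a single path substitution. The following is a minimal sketch of that substitution, not the actual patch: the function name `fix_chunked_prefill_link` and the constants are hypothetical, and only the two URLs come from this description.

```python
# Hypothetical helper illustrating the link fix; not part of the actual patch.
OLD_PATH = "performance/optimization.html"
NEW_PATH = "configuration/optimization.html"


def fix_chunked_prefill_link(text: str) -> str:
    """Return `text` with the old optimization page path replaced by the new one."""
    return text.replace(OLD_PATH, NEW_PATH)


before = "https://docs.vllm.ai/en/stable/performance/optimization.html#chunked-prefill"
after = fix_chunked_prefill_link(before)
print(after)
```

Applying the function to an already-updated link leaves it unchanged, so it is safe to run more than once over the same page.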
## Does this PR introduce _any_ user-facing change?
Yes - fixes a broken documentation link that users encounter when
clicking 'Chunked Prefill' in the supported features page.
## How was this patch tested?
- Verified the new URL resolves correctly
- Documentation change only
Closes #4217
- vLLM version: v0.16.0
- vLLM main:
15d76f74e2
Signed-off-by: NJX-njx <3771829673@qq.com>
### What this PR does / why we need it?
Add experimentally supported models/features to supported_models.md
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.16.0
- vLLM main:
15d76f74e2
Signed-off-by: zzzzwwjj <1183291235@qq.com>
### What this PR does / why we need it?
Update release note & support matrix to add experimental tag for
features and models.
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.15.0
- vLLM main:
9562912cea
0.13.0 branch: https://github.com/vllm-project/vllm-ascend/pull/6751
Signed-off-by: zzzzwwjj <1183291235@qq.com>
### What this PR does / why we need it?
This PR refactors the tutorial documentation by restructuring it into
three categories: Models, Features, and Hardware. This improves the
organization and navigation of the tutorials, making it easier for users
to find relevant information.
- The single `tutorials/index.md` is split into three separate index
files:
- `docs/source/tutorials/models/index.md`
- `docs/source/tutorials/features/index.md`
- `docs/source/tutorials/hardwares/index.md`
- Existing tutorial markdown files have been moved into their respective
new subdirectories (`models/`, `features/`, `hardwares/`).
- The main `index.md` has been updated to link to these new tutorial
sections.
This change makes the documentation structure more logical and scalable
for future additions.
### Does this PR introduce _any_ user-facing change?
Yes, this PR changes the structure and URLs of the tutorial
documentation pages. Users following old links to tutorials will
encounter broken links. It is recommended to set up redirects if the
documentation framework supports them.
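If redirects are set up, the old-to-new URL mapping could be generated rather than written by hand. The sketch below is only an illustration under assumptions: the page names in `CATEGORY_BY_PAGE` are hypothetical placeholders, the `.md` to `.html` mapping is assumed, and how the resulting map is fed to the documentation framework is left open.

```python
from pathlib import PurePosixPath

# Hypothetical page-to-category assignment; the real file names are not
# listed in this description.
CATEGORY_BY_PAGE = {
    "example_model.md": "models",
    "example_feature.md": "features",
    "example_hardware.md": "hardwares",
}


def build_redirects(root: str = "tutorials") -> dict[str, str]:
    """Map each old flat tutorial URL to its new categorized URL."""
    redirects = {}
    for page, category in CATEGORY_BY_PAGE.items():
        old = (PurePosixPath(root) / page).with_suffix(".html")
        new = (PurePosixPath(root) / category / page).with_suffix(".html")
        redirects[old.as_posix()] = new.as_posix()
    return redirects


for old_url, new_url in build_redirects().items():
    print(f"{old_url} -> {new_url}")
```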
### How was this patch tested?
These are documentation-only changes. The documentation should be built
and reviewed locally to ensure all links are correct and the pages
render as expected.
- vLLM version: v0.15.0
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.15.0
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
### What this PR does / why we need it?
update supported features
- vLLM version: v0.13.0
- vLLM main:
d68209402d
Signed-off-by: hfadzxy <starmoon_zhang@163.com>
### What this PR does / why we need it?
Add docs for Qwen3-VL-Embedding & Qwen3-VL-Reranker.
- vLLM version: v0.13.0
- vLLM main:
2c24bc6996
---------
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
### What this PR does / why we need it?
Added legend descriptions, and split redundant tables into core
supported model tables and extended compatible model tables.
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
ut
- vLLM version: v0.13.0
- vLLM main:
11b6af5280
---------
Signed-off-by: herizhen <1270637059@qq.com>
### What this PR does / why we need it?
Add GLM4.5 and GLM4.6 docs
- vLLM version: v0.13.0
- vLLM main:
2f4e6548ef
Signed-off-by: 1092626063 <1092626063@qq.com>
### What this PR does / why we need it?
1. add PaddleOCR-VL.md in the `docs/source/tutorials/`
2. add PaddleOCR-VL index in `docs/source/tutorials/index.md`
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
by CI
- vLLM version: v0.13.0
- vLLM main:
7157596103
Signed-off-by: zouyizhou <zouyizhou@huawei.com>
### What this PR does / why we need it?
Add Qwen3-Omni-30B-A3B-Thinking Tutorials
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- vLLM version: v0.13.0
- vLLM main:
5326c89803
---------
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
### What this PR does / why we need it?
Fix DeepSeek-V3.2 tutorial.
- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c
Signed-off-by: menogrey <1299267905@qq.com>
### What this PR does / why we need it?
This PR provides an introduction to the Qwen3-VL-235B-A22B-Instruct
model, details on the features supported by the model in the current
version, the model deployment process, as well as methods for
performance testing and accuracy testing.
With this document, the deployment and testing of the
Qwen3-VL-235B-A22B-Instruct model can be implemented more easily.
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c
Signed-off-by: luluxiu520 <l2625793@outlook.com>
### What this PR does / why we need it?
DeepSeekV3.1 and DeepSeekR1 doc enhancement
- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c
---------
Signed-off-by: 1092626063 <1092626063@qq.com>
### What this PR does / why we need it?
add qwen3 reranker tutorials
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.12.0
---------
Signed-off-by: TingW09 <944713709@qq.com>
### What this PR does / why we need it?
Correct more doc mistakes
- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c
Signed-off-by: lilinsiman <lilinsiman@gmail.com>
### What this PR does / why we need it?
Correct mistakes in doc
- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c
---------
Signed-off-by: lilinsiman <lilinsiman@gmail.com>
### What this PR does / why we need it?
Add a model feature matrix to the doc tutorials for:
DeepSeekR1
DeepSeekV3.1
Qwen3-Dense
Qwen3-Moe
Qwen3-Next
Qwen2.5
Qwen2.5-VL
Qwen3-VL
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c
---------
Signed-off-by: 1092626063 <1092626063@qq.com>
### What this PR does / why we need it?
This document employs the qwen3-vl-8b model and qwen2.5-vl-32b to
demonstrate the primary verification steps for the Qwen-VL series dense
models, including supported features, feature configuration, environment
preparation, NPU deployment, and accuracy and performance evaluation.
- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c
---------
Signed-off-by: MrZ20 <2609716663@qq.com>
### What this PR does / why we need it?
Support pooling models (like `bge-reranker-v2-m3`) in vllm-ascend. This
PR covers the three embedding pooling types (cls_token, mean_token,
lasttoken).
After this
[commit](17373dcd93),
vllm has provided support for adapting pooling models on the v1 engine.
This PR includes corresponding adaptations on the vllm-ascend side.
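For readers unfamiliar with the three pooling types, here is a minimal pure-Python sketch. It assumes that cls_token, mean_token, and lasttoken refer to the usual CLS, mean, and last-token pooling strategies. It is not the vLLM or vllm-ascend implementation, and the function names are hypothetical.

```python
from typing import List

Vector = List[float]


def cls_pool(hidden: List[Vector]) -> Vector:
    """CLS pooling: use the hidden state of the first token."""
    return list(hidden[0])


def last_pool(hidden: List[Vector]) -> Vector:
    """Last-token pooling: use the hidden state of the final token."""
    return list(hidden[-1])


def mean_pool(hidden: List[Vector]) -> Vector:
    """Mean pooling: average the hidden states over all tokens."""
    n_tokens = len(hidden)
    dim = len(hidden[0])
    return [sum(tok[d] for tok in hidden) / n_tokens for d in range(dim)]


# Toy hidden states for a 3-token sequence with hidden size 2.
hidden_states = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
print(cls_pool(hidden_states))   # [1.0, 2.0]
print(mean_pool(hidden_states))  # [3.0, 5.0]
print(last_pool(hidden_states))  # [5.0, 9.0]
```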
Fixes #1960
- vLLM version: v0.12.0
- vLLM main:
ad32e3e19c
---------
Signed-off-by: lianyibo <lianyibo1@kunlunit.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
Co-authored-by: MengqingCao <cmq0113@163.com>
### What this PR does / why we need it?
Correct mistakes in the documentation
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
ut
- vLLM version: v0.11.0
- vLLM main:
2918c1b49c
---------
Signed-off-by: lilinsiman <lilinsiman@gmail.com>
### What this PR does / why we need it?
The first-generation model is written as "LLama", while subsequent models
use "Llama". The second "L" should be lowercase. Other instances of
"LLama" on this page should be corrected accordingly.
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
ut
- vLLM version: v0.11.0
- vLLM main:
83f478bb19
Signed-off-by: herizhen <you@example.com>
Co-authored-by: herizhen <you@example.com>
### What this PR does / why we need it?
Corrected the errors in the information
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
ut
- vLLM version: v0.11.0
- vLLM main:
83f478bb19
Signed-off-by: lilinsiman <lilinsiman@gmail.com>
### What this PR does / why we need it?
Add model feature matrix table.
- vLLM version: v0.11.0
- vLLM main:
83f478bb19
Signed-off-by: menogrey <1299267905@qq.com>
### What this PR does / why we need it?
Release note of v0.10.0rc1
- vLLM version: v0.10.0
- vLLM main:
8e8e0b6af1
---------
Signed-off-by: MengqingCao <cmq0113@163.com>
### What this PR does / why we need it?
Update user guide for supported models
- vLLM version: v0.10.0
- vLLM main:
4be02a3776
---------
Signed-off-by: hfadzxy <starmoon_zhang@163.com>
### What this PR does / why we need it?
1. Enable pymarkdown check
2. Enable python `__init__.py` check for vllm and vllm-ascend
3. Clean up code
### How was this patch tested?
- vLLM version: v0.9.2
- vLLM main:
29c6fbe58c
---------
Signed-off-by: wangli <wangli858794774@gmail.com>
The feature support matrix is out of date. This PR refreshes the content.
- vLLM version: v0.9.2
- vLLM main:
107111a859
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Add a user doc index to make the user guide clearer
- vLLM version: v0.9.1
- vLLM main:
49e8c7ea25
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>