[v0.11.0][Doc] Update doc (#3852)
### What this PR does / why we need it?
Update doc

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
@@ -1,6 +1,6 @@
-# Features and models
+# Features and Models

-This section provides a detailed supported matrix by vLLM Ascend.
+This section provides a detailed matrix supported by vLLM Ascend.

 :::{toctree}
 :caption: Support Matrix
@@ -1,4 +1,4 @@
-# Feature Support
+# Supported Features

 The feature support principle of vLLM Ascend is: **aligned with vLLM**. We are also actively collaborating with the community to accelerate support.
@@ -6,11 +6,11 @@ You can check the [support status of vLLM V1 Engine][v1_user_guide]. Below is th

 | Feature                       | Status         | Next Step                                                                |
 |-------------------------------|----------------|--------------------------------------------------------------------------|
-| Chunked Prefill               | 🟢 Functional  | Functional, see detail note: [Chunked Prefill][cp]                       |
-| Automatic Prefix Caching      | 🟢 Functional  | Functional, see detail note: [vllm-ascend#732][apc]                      |
+| Chunked Prefill               | 🟢 Functional  | Functional, see detailed note: [Chunked Prefill][cp]                     |
+| Automatic Prefix Caching      | 🟢 Functional  | Functional, see detailed note: [vllm-ascend#732][apc]                    |
 | LoRA                          | 🟢 Functional  | [vllm-ascend#396][multilora], [vllm-ascend#893][v1 multilora]            |
 | Speculative decoding          | 🟢 Functional  | Basic support                                                            |
-| Pooling                       | 🟢 Functional  | CI needed and adapting more models; V1 support rely on vLLM support.     |
+| Pooling                       | 🟢 Functional  | CI needed to adapt to more models; V1 support relies on vLLM support.    |
 | Enc-dec                       | 🟡 Planned     | vLLM should support this feature first.                                  |
 | Multi Modality                | 🟢 Functional  | [Tutorial][multimodal], optimizing and adapting more models              |
 | LogProbs                      | 🟢 Functional  | CI needed                                                                |
@@ -18,20 +18,20 @@ You can check the [support status of vLLM V1 Engine][v1_user_guide]. Below is th
 | Async output                  | 🟢 Functional  | CI needed                                                                |
 | Beam search                   | 🟢 Functional  | CI needed                                                                |
 | Guided Decoding               | 🟢 Functional  | [vllm-ascend#177][guided_decoding]                                       |
-| Tensor Parallel               | 🟢 Functional  | Make TP >4 work with graph mode                                          |
+| Tensor Parallel               | 🟢 Functional  | Make TP >4 work with graph mode.                                         |
 | Pipeline Parallel             | 🟢 Functional  | Write official guide and tutorial.                                       |
-| Expert Parallel               | 🟢 Functional  | Dynamic EPLB support.                                                    |
+| Expert Parallel               | 🟢 Functional  | Support dynamic EPLB.                                                    |
 | Data Parallel                 | 🟢 Functional  | Data Parallel support for Qwen3 MoE.                                     |
 | Prefill Decode Disaggregation | 🟢 Functional  | Functional, xPyD is supported.                                           |
-| Quantization                  | 🟢 Functional  | W8A8 available; working on more quantization method support(W4A8, etc)   |
-| Graph Mode                    | 🔵 Experimental| Experimental, see detail note: [vllm-ascend#767][graph_mode]             |
+| Quantization                  | 🟢 Functional  | W8A8 available; working on more quantization methods (W4A8, etc.)        |
+| Graph Mode                    | 🔵 Experimental| Experimental, see detailed note: [vllm-ascend#767][graph_mode]           |
 | Sleep Mode                    | 🟢 Functional  |                                                                          |

 - 🟢 Functional: Fully operational, with ongoing optimizations.
 - 🔵 Experimental: Experimental support; interfaces and functions may change.
 - 🚧 WIP: Under active development, will be supported soon.
 - 🟡 Planned: Scheduled for future implementation (some may have open PRs/RFCs).
-- 🔴 NO plan / Deprecated: No plan or deprecated by vLLM.
+- 🔴 NO plan/Deprecated: No plan or deprecated by vLLM.

 [v1_user_guide]: https://docs.vllm.ai/en/latest/getting_started/v1_user_guide.html
 [multimodal]: https://vllm-ascend.readthedocs.io/en/latest/tutorials/single_npu_multimodal.html
@@ -1,20 +1,22 @@
-# Model Support
+# Supported Models

-Get the newest info here: https://github.com/vllm-project/vllm-ascend/issues/1608
+Get the latest info here: https://github.com/vllm-project/vllm-ascend/issues/1608

-## Text-only Language Models
+## Text-Only Language Models

 ### Generative Models

-| Model                         | Supported | Note                                                                 |
+| Model                         | Support   | Note                                                                 |
 |-------------------------------|-----------|----------------------------------------------------------------------|
-| DeepSeek v3                   | ✅        |                                                                      |
+| DeepSeek V3/3.1               | ✅        |                                                                      |
+| DeepSeek V3.2 EXP             | ✅        |                                                                      |
 | DeepSeek R1                   | ✅        |                                                                      |
 | DeepSeek Distill (Qwen/LLama) | ✅        |                                                                      |
 | Qwen3                         | ✅        |                                                                      |
 | Qwen3-based                   | ✅        |                                                                      |
 | Qwen3-Coder                   | ✅        |                                                                      |
 | Qwen3-Moe                     | ✅        |                                                                      |
 | Qwen3-Next                    | ✅        |                                                                      |
 | Qwen2.5                       | ✅        |                                                                      |
 | Qwen2                         | ✅        |                                                                      |
 | Qwen2-based                   | ✅        |                                                                      |
@@ -32,17 +34,17 @@ Get the newest info here: https://github.com/vllm-project/vllm-ascend/issues/160
 | Gemma-3                       | ✅        |                                                                      |
 | Phi-3/4                       | ✅        |                                                                      |
 | Mistral/Mistral-Instruct      | ✅        |                                                                      |
 | GLM-4.5                       | ✅        |                                                                      |
 | GLM-4                         | ❌        | [#2255](https://github.com/vllm-project/vllm-ascend/issues/2255)     |
 | GLM-4-0414                    | ❌        | [#2258](https://github.com/vllm-project/vllm-ascend/issues/2258)     |
 | ChatGLM                       | ❌        | [#554](https://github.com/vllm-project/vllm-ascend/issues/554)       |
-| DeepSeek v2.5                 | 🟡        | Need test                                                            |
+| DeepSeek V2.5                 | 🟡        | Need test                                                            |
 | Mllama                        | 🟡        | Need test                                                            |
 | MiniMax-Text                  | 🟡        | Need test                                                            |

 ### Pooling Models

-| Model                         | Supported | Note                                                                 |
+| Model                         | Support   | Note                                                                 |
 |-------------------------------|-----------|----------------------------------------------------------------------|
 | Qwen3-Embedding               | ✅        |                                                                      |
 | Molmo                         | ✅        | [1942](https://github.com/vllm-project/vllm-ascend/issues/1942)      |
@@ -52,10 +54,12 @@ Get the newest info here: https://github.com/vllm-project/vllm-ascend/issues/160

 ### Generative Models

-| Model                          | Supported     | Note                                                                 |
+| Model                          | Support       | Note                                                                 |
 |--------------------------------|---------------|----------------------------------------------------------------------|
 | Qwen2-VL                       | ✅            |                                                                      |
 | Qwen2.5-VL                     | ✅            |                                                                      |
 | Qwen3-VL                       | ✅            |                                                                      |
 | Qwen3-VL-MOE                   | ✅            |                                                                      |
 | Qwen2.5-Omni                   | ✅            | [1760](https://github.com/vllm-project/vllm-ascend/issues/1760)      |
 | QVQ                            | ✅            |                                                                      |
 | LLaVA 1.5/1.6                  | ✅            | [1962](https://github.com/vllm-project/vllm-ascend/issues/1962)      |
@@ -76,4 +80,4 @@ Get the newest info here: https://github.com/vllm-project/vllm-ascend/issues/160
 | GLM-4V                         | ❌            | [2260](https://github.com/vllm-project/vllm-ascend/issues/2260)      |
 | InternVL2.0/2.5/3.0<br>InternVideo2.5/Mono-InternVL | ❌ | [2064](https://github.com/vllm-project/vllm-ascend/issues/2064)      |
 | Whisper                        | ❌            | [2262](https://github.com/vllm-project/vllm-ascend/issues/2262)      |
-| Ultravox                       | 🟡 Need test  |                                                                      |
+| Ultravox                       | 🟡            | Need test                                                            |