[Doc] update supported features (#6165)

### What this PR does / why we need it?

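Update the supported features table: mark Graph Mode and Context Parallel as Functional (previously Experimental), and link the LoRA, Multi Modality, Graph Mode, Sleep Mode, and Context Parallel notes to their pages in the vllm-ascend documentation.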


- vLLM version: v0.13.0
- vLLM main: d68209402d

Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Author: zhangxinyuehfad
Date: 2026-01-23 09:50:11 +08:00
Committed by: GitHub
Parent: 819a4459ce
Commit: 08a45e6053

@@ -10,11 +10,11 @@ You can check the [support status of vLLM V1 Engine][v1_user_guide]. Below is th
 |-------------------------------|----------------|------------------------------------------------------------------------|
 | Chunked Prefill | 🟢 Functional | Functional, see detailed note: [Chunked Prefill][cp] |
 | Automatic Prefix Caching | 🟢 Functional | Functional, see detailed note: [vllm-ascend#732][apc] |
-| LoRA | 🟢 Functional | [vllm-ascend#396][multilora], [vllm-ascend#893][v1 multilora] |
+| LoRA | 🟢 Functional | Functional, see detailed note: [LoRA][LoRA] |
 | Speculative decoding | 🟢 Functional | Basic support |
 | Pooling | 🟢 Functional | CI needed to adapt to more models; V1 support relies on vLLM support. |
 | Enc-dec | 🟡 Planned | vLLM should support this feature first. |
-| Multi Modality | 🟢 Functional | [Tutorial][multimodal], optimizing and adapting more models |
+| Multi Modality | 🟢 Functional | [Multi Modality][multimodal], optimizing and adapting more models |
 | LogProbs | 🟢 Functional | CI needed |
 | Prompt logProbs | 🟢 Functional | CI needed |
 | Async output | 🟢 Functional | CI needed |
@@ -26,9 +26,9 @@ You can check the [support status of vLLM V1 Engine][v1_user_guide]. Below is th
 | Data Parallel | 🟢 Functional | Data Parallel support for Qwen3 MoE. |
 | Prefill Decode Disaggregation | 🟢 Functional | Functional, xPyD is supported. |
 | Quantization | 🟢 Functional | W8A8 available; working on more quantization method support (W4A8, etc) |
-| Graph Mode | 🔵 Experimental| Experimental, see detailed note: [vllm-ascend#767][graph_mode] |
-| Sleep Mode | 🟢 Functional | |
-| Context Parallel | 🔵 Experimental| |
+| Graph Mode | 🟢 Functional | Functional, see detailed note: [Graph Mode][graph_mode] |
+| Sleep Mode | 🟢 Functional | Functional, see detailed note: [Sleep Mode][sleep_mode] |
+| Context Parallel | 🟢 Functional | Functional, see detailed note: [Context Parallel][context_parallel] |
 
 - 🟢 Functional: Fully operational, with ongoing optimizations.
 - 🔵 Experimental: Experimental support, interfaces and functions may change.
@@ -37,12 +37,12 @@ You can check the [support status of vLLM V1 Engine][v1_user_guide]. Below is th
 - 🔴 NO plan/Deprecated: No plan or deprecated by vLLM.
 
 [v1_user_guide]: https://docs.vllm.ai/en/latest/getting_started/v1_user_guide.html
-[multimodal]: https://docs.vllm.ai/projects/ascend/en/latest/tutorials/single_npu_multimodal.html
+[multimodal]: https://docs.vllm.ai/projects/ascend/en/latest/tutorials/Qwen-VL-Dense.html
 [guided_decoding]: https://github.com/vllm-project/vllm-ascend/issues/177
-[multilora]: https://github.com/vllm-project/vllm-ascend/issues/396
-[v1 multilora]: https://github.com/vllm-project/vllm-ascend/pull/893
-[graph_mode]: https://github.com/vllm-project/vllm-ascend/issues/767
+[LoRA]: https://docs.vllm.ai/projects/ascend/en/latest/user_guide/feature_guide/lora.html
+[graph_mode]: https://docs.vllm.ai/projects/ascend/en/latest/user_guide/feature_guide/graph_mode.html
 [apc]: https://github.com/vllm-project/vllm-ascend/issues/732
 [cp]: https://docs.vllm.ai/en/stable/performance/optimization.html#chunked-prefill
 [1P1D]: https://github.com/vllm-project/vllm-ascend/pull/950
-[ray]: https://github.com/vllm-project/vllm-ascend/issues/1751
+[context_parallel]: https://docs.vllm.ai/projects/ascend/en/latest/user_guide/feature_guide/context_parallel.html
+[sleep_mode]: https://docs.vllm.ai/projects/ascend/en/latest/user_guide/feature_guide/sleep_mode.html