Files
xc-llm-ascend/docs/source/user_guide/suppoted_features.md
wangxiyuan c99c4c8c70 [Doc] Update feature support list (#650)
1. remove Chinese doc. The content is out of data and we don't have
enough time to maintain it.
2. Update feature support matrix. Refresh the content and add V1 status.

---------

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
2025-04-26 10:27:29 +08:00

4.6 KiB

Feature Support

The feature support principle of vLLM Ascend is: aligned with the vLLM. We are also actively collaborating with the community to accelerate support.

You can check the support status of vLLM V1 Engine. Below is the feature support status of vLLM Ascend:

Feature vLLM V0 Engine vLLM V1 Engine Next Step
Chunked Prefill 🚧 WIP 🚧 WIP Functional, waiting for CANN 8.1 nnal package release
Automatic Prefix Caching 🚧 WIP 🚧 WIP Functional, waiting for CANN 8.1 nnal package release
LoRA 🟢 Functional 🚧 WIP vllm-ascend#396, CI needed, working on V1 support
Prompt adapter No plan 🟡 Planned Plan in 2025.06.30
Speculative decoding 🟢 Functional 🚧 WIP CI needed; working on V1 support
Pooling 🟢 Functional 🟢 Functional CI needed and adapting more models; V1 support rely on vLLM support.
Enc-dec 🔴 NO plan 🟡 Planned Plan in 2025.06.30
Multi Modality 🟢 Functional 🟢 Functional Tutorial, optimizing and adapting more models
LogProbs 🟢 Functional 🟢 Functional CI needed
Prompt logProbs 🟢 Functional 🟢 Functional CI needed
Async output 🟢 Functional 🟢 Functional CI needed
Multi step scheduler 🟢 Functional 🔴 Deprecated vllm#8779, replaced by vLLM V1 Scheduler)
Best of 🟢 Functional 🔴 Deprecated vllm#13361, CI needed
Beam search 🟢 Functional 🟢 Functional CI needed
Guided Decoding 🟢 Functional 🟢 Functional vllm-ascend#177
Tensor Parallel 🟢 Functional 🟢 Functional CI needed
Pipeline Parallel 🟢 Functional 🟢 Functional CI needed
Expert Parallel 🔴 NO plan 🟢 Functional CI needed; No plan on V0 support
Data Parallel 🔴 NO plan 🟢 Functional CI needed; No plan on V0 support
Prefill Decode Disaggregation 🟢 Functional 🟢 Functional 1P1D available, working on xPyD and V1 support.
Quantization 🟢 Functional 🟢 Functional W8A8 available, CI needed; working on more quantization method support
Graph Mode 🔴 NO plan 🟢 Functional Functional, waiting for CANN 8.1 nnal package release
Sleep Mode 🟢 Functional 🟢 Functional level=1 available, CI needed, working on V1 support
  • 🟢 Functional: Fully operational, with ongoing optimizations.
  • 🚧 WIP: Under active development
  • 🟡 Planned: Scheduled for future implementation (some may have open PRs/RFCs).
  • 🔴 NO plan / Deprecated: No plan for V0 or deprecated by vLLM v1.