### What this PR does / why we need it?

Update Feature Support doc.

### Does this PR introduce _any_ user-facing change?

no.

### How was this patch tested?

no.

---------

Signed-off-by: Shanshan Shen <467638484@qq.com>

# Feature Support

| Feature | Supported | Note |
|---------|-----------|------|
| Chunked Prefill | ✗ | Planned for 2025 Q1 |
| Automatic Prefix Caching | ✅ | Performance improvements planned for 2025 Q2 |
| LoRA | ✗ | Planned for 2025 Q1 |
| Prompt adapter | ✗ | Planned for 2025 Q1 |
| Speculative decoding | ✗ | Planned for 2025 Q1 |
| Pooling | ✗ | Planned for 2025 Q2 |
| Enc-dec | ✗ | Planned for 2025 Q2 |
| Multi Modality | ✅ (LLaVA/Qwen2-VL/Qwen2-Audio/InternVL) | Support for more models planned for 2025 Q1 |
| LogProbs | ✅ | |
| Prompt logProbs | ✅ | |
| Async output | ✅ | |
| Multi step scheduler | ✗ | Planned for 2025 Q1 |
| Best of | ✅ | |
| Beam search | ✅ | |
| Guided Decoding | ✅ | See the [<u>issue</u>](https://github.com/vllm-project/vllm-ascend/issues/177) for details |
| Tensor Parallel | ✅ | Only "mp" is currently supported |
| Pipeline Parallel | ✅ | Only "mp" is currently supported |