From c935b7006c305357e6d8f329309b5ddbad43f975 Mon Sep 17 00:00:00 2001
From: Mengqing Cao
Date: Mon, 17 Feb 2025 15:43:37 +0800
Subject: [PATCH] [doc] fix feature support (#70)

Check and update the feature support table.

- both multi-step and speculative decoding require adaptation of corresponding workers
- prompt adapter (finetune method) require adaption in worker.py and model_runner.py

Signed-off-by: MengqingCao
---
 docs/source/features/suppoted_features.md | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/docs/source/features/suppoted_features.md b/docs/source/features/suppoted_features.md
index b13bbb2..0c65a1b 100644
--- a/docs/source/features/suppoted_features.md
+++ b/docs/source/features/suppoted_features.md
@@ -3,17 +3,19 @@
 | Feature | Supported | Note |
 |---------|-----------|------|
 | Chunked Prefill | ✗ | Plan in 2025 Q1 |
-| Automatic Prefix Caching | ✅ | Improve performance in 2025 Q1 |
+| Automatic Prefix Caching | ✅ | Improve performance in 2025 Q2 |
 | LoRA | ✗ | Plan in 2025 Q1 |
-| Prompt adapter | ✅ ||
-| Speculative decoding | ✅ | Improve accuracy in 2025 Q1|
-| Pooling | ✗ | Plan in 2025 Q1 |
-| Enc-dec | ✗ | Plan in 2025 Q1 |
+| Prompt adapter | ✗ | Plan in 2025 Q1 |
+| Speculative decoding | ✗ | Plan in 2025 Q1 |
+| Pooling | ✗ | Plan in 2025 Q2 |
+| Enc-dec | ✗ | Plan in 2025 Q2 |
 | Multi Modality | ✅ (LLaVA/Qwen2-vl/Qwen2-audio/internVL)| Add more model support in 2025 Q1 |
 | LogProbs | ✅ ||
 | Prompt logProbs | ✅ ||
 | Async output | ✅ ||
-| Multi step scheduler | ✅ ||
+| Multi step scheduler | ✗ | Plan in 2025 Q1 |
 | Best of | ✅ ||
 | Beam search | ✅ ||
 | Guided Decoding | ✗ | Plan in 2025 Q1 |
+| Tensor Parallel | ✅ | Only "mp" supported now |
+| Pipeline Parallel | ✅ | Only "mp" supported now |