[doc] fix feature support (#70)
Check and update the feature support table.
- Both multi-step scheduling and speculative decoding require adaptation of the corresponding workers.
- Prompt adapter (a fine-tuning method) requires adaptation in worker.py and model_runner.py.

Signed-off-by: MengqingCao <cmq0113@163.com>
@@ -3,17 +3,19 @@
 | Feature | Supported | Note |
 |---------|-----------|------|
 | Chunked Prefill | ✗ | Plan in 2025 Q1 |
-| Automatic Prefix Caching | ✅ | Improve performance in 2025 Q1 |
+| Automatic Prefix Caching | ✅ | Improve performance in 2025 Q2 |
 | LoRA | ✗ | Plan in 2025 Q1 |
-| Prompt adapter | ✅ ||
+| Prompt adapter | ✗ | Plan in 2025 Q1 |
-| Speculative decoding | ✅ | Improve accuracy in 2025 Q1|
+| Speculative decoding | ✗ | Plan in 2025 Q1 |
-| Pooling | ✗ | Plan in 2025 Q1 |
+| Pooling | ✗ | Plan in 2025 Q2 |
-| Enc-dec | ✗ | Plan in 2025 Q1 |
+| Enc-dec | ✗ | Plan in 2025 Q2 |
 | Multi Modality | ✅ (LLaVA/Qwen2-vl/Qwen2-audio/internVL)| Add more model support in 2025 Q1 |
 | LogProbs | ✅ ||
 | Prompt logProbs | ✅ ||
 | Async output | ✅ ||
-| Multi step scheduler | ✅ ||
+| Multi step scheduler | ✗ | Plan in 2025 Q1 |
 | Best of | ✅ ||
 | Beam search | ✅ ||
 | Guided Decoding | ✗ | Plan in 2025 Q1 |
+| Tensor Parallel | ✅ | Only "mp" supported now |
+| Pipeline Parallel | ✅ | Only "mp" supported now |