Check and update the feature support table.

- Both multi-step scheduling and speculative decoding require adaptation of their corresponding workers.
- The prompt adapter (a fine-tuning method) requires adaptation in worker.py and model_runner.py.

Signed-off-by: MengqingCao <cmq0113@163.com>
Feature Support
| Feature | Supported | Note |
|---|---|---|
| Chunked Prefill | ✗ | Planned for 2025 Q1 |
| Automatic Prefix Caching | ✅ | Performance improvements planned for 2025 Q2 |
| LoRA | ✗ | Planned for 2025 Q1 |
| Prompt adapter | ✗ | Planned for 2025 Q1 |
| Speculative decoding | ✗ | Planned for 2025 Q1 |
| Pooling | ✗ | Planned for 2025 Q2 |
| Encoder-decoder | ✗ | Planned for 2025 Q2 |
| Multi Modality | ✅ (LLaVA/Qwen2-VL/Qwen2-Audio/InternVL) | More model support planned for 2025 Q1 |
| LogProbs | ✅ | |
| Prompt LogProbs | ✅ | |
| Async output | ✅ | |
| Multi-step scheduler | ✗ | Planned for 2025 Q1 |
| Best of | ✅ | |
| Beam search | ✅ | |
| Guided Decoding | ✗ | Planned for 2025 Q1 |
| Tensor Parallel | ✅ | Only the "mp" backend is supported for now |
| Pipeline Parallel | ✅ | Only the "mp" backend is supported for now |
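The parallelism rows note that only the "mp" (multiprocessing) executor backend is currently supported. A minimal sketch of how that backend might be selected when serving, assuming the standard vLLM CLI flags; the model name and world size here are illustrative assumptions, not part of the table:

```shell
# Hedged sketch: serve with tensor parallelism over the "mp" executor backend.
# Qwen/Qwen2-VL-7B-Instruct and the world size of 2 are illustrative choices.
vllm serve Qwen/Qwen2-VL-7B-Instruct \
  --tensor-parallel-size 2 \
  --distributed-executor-backend mp
```

Pipeline parallelism would be requested analogously via `--pipeline-parallel-size`, again with the "mp" backend.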