xc-llm-ascend/docs/source/features/suppoted_features.md
Mengqing Cao c935b7006c [doc] fix feature support (#70)
Check and update the feature support table.

- both multi-step scheduling and speculative decoding require adaptation of the corresponding workers
- prompt adapter (a fine-tuning method) requires adaptation in worker.py and model_runner.py

Signed-off-by: MengqingCao <cmq0113@163.com>
2025-02-17 15:43:37 +08:00


Feature Support

| Feature | Supported | Note |
|---------|-----------|------|
| Chunked Prefill | | Plan in 2025 Q1 |
| Automatic Prefix Caching | | Improve performance in 2025 Q2 |
| LoRA | | Plan in 2025 Q1 |
| Prompt adapter | | Plan in 2025 Q1 |
| Speculative decoding | | Plan in 2025 Q1 |
| Pooling | | Plan in 2025 Q2 |
| Enc-dec | | Plan in 2025 Q2 |
| Multi Modality (LLaVA/Qwen2-vl/Qwen2-audio/internVL) | | Add more model support in 2025 Q1 |
| LogProbs | | |
| Prompt logProbs | | |
| Async output | | |
| Multi step scheduler | | Plan in 2025 Q1 |
| Best of | | |
| Beam search | | |
| Guided Decoding | | Plan in 2025 Q1 |
| Tensor Parallel | | Only "mp" supported now |
| Pipeline Parallel | | Only "mp" supported now |
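
The "mp" note on the Tensor Parallel and Pipeline Parallel rows presumably refers to vLLM's multiprocessing distributed executor backend; that reading is an assumption, since the table does not spell it out. The sketch below shows one way to pin that backend when requesting more than one device. It assumes vLLM's `LLM` entry point and its `tensor_parallel_size`, `pipeline_parallel_size`, and `distributed_executor_backend` arguments; the helpers `build_engine_kwargs` and `make_llm` are hypothetical names introduced for illustration and are not part of vLLM or vllm-ascend.

```python
from typing import Any, Dict


def build_engine_kwargs(model: str,
                        tensor_parallel_size: int = 1,
                        pipeline_parallel_size: int = 1) -> Dict[str, Any]:
    """Collect engine arguments, pinning the executor backend to "mp".

    Hypothetical helper for illustration only.
    """
    if tensor_parallel_size < 1 or pipeline_parallel_size < 1:
        raise ValueError("parallel sizes must be positive integers")

    kwargs: Dict[str, Any] = {
        "model": model,
        "tensor_parallel_size": tensor_parallel_size,
        "pipeline_parallel_size": pipeline_parallel_size,
    }
    # Only multi-device runs need a distributed executor. Per the table,
    # "mp" is the only backend supported for now (assumed to mean the
    # multiprocessing executor).
    if tensor_parallel_size * pipeline_parallel_size > 1:
        kwargs["distributed_executor_backend"] = "mp"
    return kwargs


def make_llm(**kwargs: Any):
    """Create a vLLM engine from the collected arguments.

    vLLM is imported lazily so build_engine_kwargs can be used and
    tested on a machine without vLLM installed.
    """
    from vllm import LLM
    return LLM(**kwargs)
```

With vLLM and vllm-ascend installed, something like `make_llm(**build_engine_kwargs("<model-name>", tensor_parallel_size=2))` would request two-way tensor parallelism on the "mp" backend; `<model-name>` is a placeholder, not a tested configuration.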