Add EPD doc and load-balance proxy example
- vLLM version: v0.14.0
- vLLM main: d68209402d
---------
Signed-off-by: 李少鹏 <lishaopeng21@huawei.com>
# Feature Guide
This section provides a detailed usage guide to vLLM Ascend features.
:::{toctree}
:caption: Feature Guide
:maxdepth: 1
graph_mode
cpu_binding
quantization
sleep_mode
structured_output
lora
eplb_swift_balancer
netloader
Multi_Token_Prediction
dynamic_batch
epd_disaggregation
kv_pool
external_dp
large_scale_ep
ucm_deployment
Fine_grained_TP
layer_sharding
speculative_decoding
context_parallel
npugraph_ex
weight_prefetch
sequence_parallelism
batch_invariance
:::