Feature Guide
This section provides detailed usage guides for vLLM Ascend features.
:::{toctree}
:caption: Feature Guide
:maxdepth: 1
graph_mode
quantization
sleep_mode
structured_output
lora
eplb_swift_balancer
netloader
Multi_Token_Prediction
dynamic_batch
kv_pool
external_dp
large_scale_ep
ucm_deployment
Fine_grained_TP
layer_sharding
speculative_decoding
context_parallel
npugraph_ex
weight_prefetch
sequence_parallelism
:::