### What this PR does / why we need it?
This PR add docs of batch invariance and make some extra operators
according to validation result.
please see https://github.com/vllm-project/vllm-ascend/issues/5487 to
track progress.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- vLLM version: v0.16.0
- vLLM main:
15d76f74e2
---------
Signed-off-by: Ronald1995 <ronaldautomobile@163.com>
452 B
452 B
Feature Guide
This section provides a detailed usage guide of vLLM Ascend features.
:::{toctree} :caption: Feature Guide :maxdepth: 1 graph_mode quantization sleep_mode structured_output lora eplb_swift_balancer netloader Multi_Token_Prediction dynamic_batch kv_pool external_dp large_scale_ep ucm_deployment Fine_grained_TP layer_sharding speculative_decoding context_parallel npugraph_ex weight_prefetch sequence_parallelism batch_invariance :::