### What this PR does / why we need it?
This PR adds a load-balancing DP (data parallel) proxy server for
external DP deployments where Disaggregated-Prefill is not enabled. It also
adds documentation for external DP and the load-balancing DP proxy server.
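
To illustrate the idea of balancing requests across external DP ranks, here is a minimal sketch. It assumes a least-in-flight-requests policy, which is one common choice and is not confirmed by this description to be the policy the proxy uses. `Backend`, `LeastLoadedBalancer`, and the example URLs are hypothetical names for illustration, not part of the PR.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """One external DP rank, modeled as an independent server (hypothetical)."""
    url: str
    in_flight: int = 0  # requests currently being served by this rank


class LeastLoadedBalancer:
    """Pick the backend with the fewest in-flight requests (assumed policy)."""

    def __init__(self, urls):
        if not urls:
            raise ValueError("at least one backend URL is required")
        self.backends = [Backend(url) for url in urls]

    def acquire(self) -> Backend:
        # min() returns the first minimum, so ties go to the earliest backend.
        backend = min(self.backends, key=lambda b: b.in_flight)
        backend.in_flight += 1
        return backend

    def release(self, backend: Backend) -> None:
        # Call when the request finishes, whether it succeeded or failed.
        backend.in_flight = max(0, backend.in_flight - 1)


if __name__ == "__main__":
    # Placeholder addresses; a real deployment would list its own DP ranks.
    balancer = LeastLoadedBalancer(["http://127.0.0.1:8100", "http://127.0.0.1:8101"])
    first = balancer.acquire()
    second = balancer.acquire()
    print(first.url, second.url)
    balancer.release(first)
    print(balancer.acquire().url)
```

A real proxy would wrap `acquire` and `release` around each forwarded HTTP request so that the counters track live load per rank.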
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
See the new doc.
- vLLM version: v0.11.0
- vLLM main: 2918c1b49c
---------
Signed-off-by: whx-sjtu <2952154980@qq.com>
# Feature Guide
This section provides a detailed usage guide of vLLM Ascend features.
:::{toctree}
:caption: Feature Guide
:maxdepth: 1
graph_mode
quantization
sleep_mode
structured_output
lora
eplb_swift_balancer
netloader
dynamic_batch
kv_pool_mooncake
external_dp
:::