### What this PR does / why we need it?

As part of the preparation work for the [RFC](https://github.com/vllm-project/vllm-ascend/issues/6214), we have added documentation for npugraph_ex that explains its usage and its FX graph optimization. The FX graph optimization section covers the default passes, how to implement custom fusion passes, and how to capture the FX graph during optimization by configuring environment variables.

---------

Signed-off-by: chencangtao <chencangtao@huawei.com>
Co-authored-by: chencangtao <chencangtao@huawei.com>
# Feature Guide
This section provides an overview of the features implemented in vLLM Ascend. Developers can refer to this guide to understand how vLLM Ascend works.
:::{toctree}
:caption: Feature Guide
:maxdepth: 1
patch
ModelRunner_prepare_inputs
disaggregated_prefill
eplb_swift_balancer.md
ACL_Graph
KV_Cache_Pool_Guide
add_custom_aclnn_op
context_parallel
quantization
npugraph_ex
:::