[DOC] update doc: LoRA with ACLGraph (#2430)

### What this PR does / why we need it?
Update DOC. Guide users to run LoRA with ACLGraph.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
No.

- vLLM version: v0.10.0
- vLLM main: de7b67a023

---------

Signed-off-by: paulyu12 <507435917@qq.com>
2025-08-21 08:55:55 +08:00
parent 0dca4c6dbd
commit 973a7cfdf0
2 changed files with 19 additions and 4 deletions
--- a/docs/source/installation.md
+++ b/docs/source/installation.md
@@ -171,7 +171,7 @@ vllm-ascend will build custom ops by default. If you don't want to build it, set

 ```{note}
 If you are building from v0.7.3-dev and intend to use sleep mode feature, you should set `COMPILE_CUSTOM_KERNELS=1` manually.
-To build custom ops, gcc/g++ higher than 8 and c++ 17 or higher is required. If you're using `pip install -e .` and encourage a torch-npu version conflict, please install with `pip install --no-build-isolation -e .` to build on system env.
+To build custom ops, gcc/g++ higher than 8 and c++ 17 or higher is required. If you're using `pip install -e .` and encounter a torch-npu version conflict, please install with `pip install --no-build-isolation -e .` to build on system env.
 If you encounter other problems during compiling, it is probably because unexpected compiler is being used, you may export `CXX_COMPILER` and `C_COMPILER` in env to specify your g++ and gcc locations before compiling.
 ```

--- a/docs/source/user_guide/feature_guide/lora.md
+++ b/docs/source/user_guide/feature_guide/lora.md
@@ -1,8 +1,23 @@
 # LoRA Adapters Guide

+## Overview
 Like vLLM, vllm-ascend supports LoRA as well. The usage and more details can be found in [vLLM official document](https://docs.vllm.ai/en/latest/features/lora.html).

-You can also refer to [this](https://docs.vllm.ai/en/latest/models/supported_models.html#list-of-text-only-language-models) to find which models support LoRA in vLLM.
+You can refer to [Supported Models](https://docs.vllm.ai/en/latest/models/supported_models.html#list-of-text-only-language-models) to find which models support LoRA in vLLM.

-## Tips
-If you fail to run vllm-ascend with LoRA, you may follow [this instruction](https://vllm-ascend.readthedocs.io/en/latest/user_guide/feature_guide/graph_mode.html#fallback-to-eager-mode) to disable graph mode and try again.
+LoRA can now run in ACLGraph mode. Refer to the [Graph Mode Guide](./graph_mode.md) for better LoRA performance.
+
+## Example
+The following simple LoRA example runs with ACLGraph mode enabled by default.
+
+```shell
+vllm serve meta-llama/Llama-2-7b \
+    --enable-lora \
+    --lora-modules '{"name": "sql-lora", "path": "/path/to/lora", "base_model_name": "meta-llama/Llama-2-7b"}'
+```
+
+## Custom LoRA Operators
+
+vllm-ascend implements LoRA-related AscendC operators, such as `bgmv_shrink`, `bgmv_expand`, `sgmv_shrink` and `sgmv_expand`. You can find them under the `csrc/kernels` directory of the [vllm-ascend repo](https://github.com/vllm-project/vllm-ascend.git).
+
+When you install vllm and vllm-ascend, the operators mentioned above are compiled and installed automatically. If you don't want to use the AscendC operators when running vllm-ascend, set `COMPILE_CUSTOM_KERNELS=0` and reinstall vllm-ascend. For more instructions on installation and compilation, refer to the [installation guide](../../installation.md).
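
As a follow-up to the `vllm serve` example in the LoRA guide above, here is a minimal sketch of how a request might select the adapter through the OpenAI-compatible completions endpoint. It assumes the server is reachable at `http://localhost:8000` (vLLM's default port) and that the adapter was registered as `sql-lora`; the prompt text, the `max_tokens` value, and the `query_lora` helper name are illustrative and not part of the original guide.

```shell
# Assumptions: the server from the example above runs locally on vLLM's
# default port 8000, and the adapter was registered under the name "sql-lora".
BASE_URL="http://localhost:8000"
LORA_NAME="sql-lora"

# Build the JSON body; the "model" field selects the LoRA adapter by name.
PAYLOAD=$(printf '{"model": "%s", "prompt": "%s", "max_tokens": %d}' \
    "$LORA_NAME" "San Francisco is a" 16)

# Hypothetical helper: nothing is sent until the function is called.
query_lora() {
    curl -s "$BASE_URL/v1/completions" \
        -H "Content-Type: application/json" \
        -d "$PAYLOAD"
}
```

With the server running, calling `query_lora` sends the request. Putting the base model name in the `model` field instead would run the request without the adapter.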