[DOC] update doc: LoRA with ACLGraph (#2430)
### What this PR does / why we need it?
Update the documentation to guide users on running LoRA with ACLGraph.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
No.
- vLLM version: v0.10.0
- vLLM main: de7b67a023
---------
Signed-off-by: paulyu12 <507435917@qq.com>
@@ -1,8 +1,23 @@
# LoRA Adapters Guide

## Overview
Like vLLM, vllm-ascend supports LoRA. Usage and further details can be found in the [vLLM official documentation](https://docs.vllm.ai/en/latest/features/lora.html).

You can also refer to [this](https://docs.vllm.ai/en/latest/models/supported_models.html#list-of-text-only-language-models) to find which models support LoRA in vLLM.
You can refer to [Supported Models](https://docs.vllm.ai/en/latest/models/supported_models.html#list-of-text-only-language-models) to find which models support LoRA in vLLM.

## Tips
If you fail to run vllm-ascend with LoRA, you may follow [this instruction](https://vllm-ascend.readthedocs.io/en/latest/user_guide/feature_guide/graph_mode.html#fallback-to-eager-mode) to disable graph mode and try again.
You can now run LoRA with ACLGraph mode. Refer to the [Graph Mode Guide](./graph_mode.md) for better LoRA performance.

## Example
The following is a simple LoRA example, with ACLGraph mode enabled by default.

```shell
vllm serve meta-llama/Llama-2-7b \
    --enable-lora \
    --lora-modules '{"name": "sql-lora", "path": "/path/to/lora", "base_model_name": "meta-llama/Llama-2-7b"}'
```
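
Once the server is running, the adapter can be queried through vLLM's OpenAI-compatible API by passing the adapter name as the `model` field. The following is a minimal sketch, not part of the original guide: it assumes the server listens on `http://localhost:8000` (vLLM's default port), and the helper names `build_payload` and `query_lora` are hypothetical, chosen for illustration only.

```shell
# Assumption: the server started above listens on vLLM's default port.
BASE_URL="${BASE_URL:-http://localhost:8000}"
# Must match the "name" given in --lora-modules.
LORA_NAME="sql-lora"

# Build the JSON body for the completions endpoint.
# Arguments: model name, prompt (without double quotes), max tokens.
build_payload() {
  printf '{"model": "%s", "prompt": "%s", "max_tokens": %d, "temperature": 0}' "$1" "$2" "$3"
}

# Send a completion request that is routed to the LoRA adapter.
query_lora() {
  curl -s "$BASE_URL/v1/completions" \
    -H "Content-Type: application/json" \
    -d "$(build_payload "$LORA_NAME" "$1" 64)"
}

# Usage (requires the server to be running):
# query_lora "San Francisco is a"
```

The helper does no JSON escaping, so it only suits simple prompts. To compare against the base model, pass `meta-llama/Llama-2-7b` as the model name instead of `sql-lora`.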

## Custom LoRA Operators

We have implemented LoRA-related AscendC operators, such as `bgmv_shrink`, `bgmv_expand`, `sgmv_shrink` and `sgmv_expand`. You can find them under the `csrc/kernels` directory of the [vllm-ascend repo](https://github.com/vllm-project/vllm-ascend.git).

When you install vllm and vllm-ascend, the operators mentioned above are compiled and installed automatically. If you do not want to use the AscendC operators when running vllm-ascend, set `COMPILE_CUSTOM_KERNELS=0` and reinstall vllm-ascend. For more instructions on installation and compilation, refer to the [installation guide](../../installation.md).
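
The reinstall step can be sketched as follows. This is an illustration under stated assumptions, not the project's official procedure: it assumes vllm-ascend was installed from a source checkout with an editable `pip` install, and both the `VLLM_ASCEND_SRC` variable and the function name are hypothetical. The installation guide remains the authoritative reference.

```shell
# Hypothetical location of the vllm-ascend source checkout; adjust as needed.
VLLM_ASCEND_SRC="${VLLM_ASCEND_SRC:-./vllm-ascend}"

reinstall_without_custom_kernels() {
  # Run in a subshell so the caller's working directory is unchanged.
  (
    cd "$VLLM_ASCEND_SRC" || exit 1
    pip uninstall -y vllm-ascend
    # Skip building the AscendC operators for this install.
    COMPILE_CUSTOM_KERNELS=0 pip install -v -e .
  )
}

# Usage:
# reinstall_without_custom_kernels
```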