# LoRA Adapters Guide

## Overview
Like vLLM, vllm-ascend supports LoRA. For usage and further details, see the [official vLLM documentation](https://docs.vllm.ai/en/latest/features/lora.html).

You can refer to [Supported Models](https://docs.vllm.ai/en/latest/models/supported_models.html#list-of-text-only-language-models) to find which models support LoRA in vLLM.

LoRA can now run in ACLGraph mode. Refer to the [Graph Mode Guide](./graph_mode.md) for better LoRA performance.

## Example
The following simple example serves a base model with one LoRA adapter. ACLGraph mode is enabled by default.

```shell
vllm serve meta-llama/Llama-2-7b \
    --enable-lora \
    --lora-modules '{"name": "sql-lora", "path": "/path/to/lora", "base_model_name": "meta-llama/Llama-2-7b"}'
```
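
Once the server is running, the adapter can be selected by passing its registered name as the `model` field of an OpenAI-compatible request. The snippet below is a minimal sketch of that request, not part of the original example. It assumes the server listens on `http://localhost:8000` (the vLLM default port), that the adapter name is `sql-lora` as registered above, and the prompt and sampling values are placeholders.

```shell
# Assumption: the server from the example above is reachable at this address.
BASE_URL="${BASE_URL:-http://localhost:8000}"

# The "model" field names the LoRA adapter registered via --lora-modules.
# Prompt and sampling values are illustrative placeholders.
PAYLOAD='{"model": "sql-lora", "prompt": "San Francisco is a", "max_tokens": 7, "temperature": 0}'

# Send a completion request; print a hint instead of failing hard if the
# server is not reachable.
curl -s --max-time 5 "${BASE_URL}/v1/completions" \
    -H "Content-Type: application/json" \
    -d "${PAYLOAD}" \
    || echo "request failed: is the vllm server running at ${BASE_URL}?" >&2
```

Setting `model` to the base model name instead sends the request to the base model without the adapter.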

## Custom LoRA Operators

We have implemented LoRA-related AscendC operators, such as `bgmv_shrink`, `bgmv_expand`, `sgmv_shrink`, and `sgmv_expand`. You can find them under the `csrc/kernels` directory of the [vllm-ascend repo](https://github.com/vllm-project/vllm-ascend.git).

When you install vllm and vllm-ascend, these operators are compiled and installed automatically. If you do not want to use the AscendC operators when running vllm-ascend, set `COMPILE_CUSTOM_KERNELS=0` and reinstall vllm-ascend. For more instructions on installation and compilation, refer to the [installation guide](../../installation.md).
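
As a rough sketch of reinstalling without the custom kernels, the steps might look like the following. It assumes a source install from a local checkout of vllm-ascend. The `VLLM_ASCEND_SRC` variable and its default path are hypothetical names used only for illustration, and the installation guide remains the authoritative reference for the exact commands.

```shell
# Hypothetical checkout location; adjust to where you cloned vllm-ascend.
VLLM_ASCEND_SRC="${VLLM_ASCEND_SRC:-./vllm-ascend}"

# Skip compilation of the custom AscendC operators during the build.
export COMPILE_CUSTOM_KERNELS=0

# Reinstall from source so the setting takes effect (assumes an editable
# source install; see the installation guide for the supported procedure).
if [ -d "${VLLM_ASCEND_SRC}" ]; then
    pip install -v -e "${VLLM_ASCEND_SRC}"
else
    echo "vllm-ascend source tree not found at ${VLLM_ASCEND_SRC}" >&2
fi
```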