
SILONG ZENG 4811ba62e0 [Lint]Style: reformat markdown files via markdownlint (#5884 )

### What this PR does / why we need it?
reformat markdown files via markdownlint

- vLLM version: v0.13.0
- vLLM main:
bde38c11df

---------

Signed-off-by: root <root@LAPTOP-VQKDDVMG.localdomain>
Signed-off-by: MrZ20 <2609716663@qq.com>
Co-authored-by: root <root@LAPTOP-VQKDDVMG.localdomain>

2026-01-15 09:06:01 +08:00


LoRA Adapters Guide

Overview

Like vLLM, vllm-ascend supports LoRA. Usage instructions and further details are available in the official vLLM documentation.

You can refer to Supported Models to find which models support LoRA in vLLM.

LoRA can now run in ACLGraph mode. Refer to the Graph Mode Guide for better LoRA performance.

Model download addresses:
Base model: https://www.modelscope.cn/models/vllm-ascend/Llama-2-7b-hf/files
LoRA model: https://www.modelscope.cn/models/vllm-ascend/llama-2-7b-sql-lora-test/files

Example

We provide a simple LoRA example here, which enables the ACLGraph mode by default.

vllm serve meta-llama/Llama-2-7b \
    --enable-lora \
    --lora-modules '{"name": "sql-lora", "path": "/path/to/lora", "base_model_name": "meta-llama/Llama-2-7b"}'
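
Once the server is running, requests select the adapter by passing its registered name (`sql-lora` above) as the `model` field of the OpenAI-compatible API. The following is a minimal client sketch using only the Python standard library. The base URL assumes vLLM's default port on the local machine, and the helper names `build_completion_request` and `complete` are illustrative, not part of vLLM or vllm-ascend.

```python
import json
import urllib.request

# Assumption: the server started above listens on vLLM's default address.
# Adjust host and port to match your deployment.
BASE_URL = "http://localhost:8000"


def build_completion_request(prompt: str, model: str = "sql-lora",
                             max_tokens: int = 64) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible /v1/completions route.

    `model` is the adapter name registered via --lora-modules. Passing the
    base model name instead would send the request to the base model
    without the adapter.
    """
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str) -> str:
    """Send the request and return the text of the first completion."""
    with urllib.request.urlopen(build_completion_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

Calling `complete(...)` with a prompt of your choice requires the server to be reachable. `build_completion_request` can be inspected offline to check the payload before sending it.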

Custom LoRA Operators

We have implemented LoRA-related AscendC operators, such as bgmv_shrink, bgmv_expand, sgmv_shrink, and sgmv_expand. You can find them under the "csrc/kernels" directory of the vllm-ascend repo.
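
To make the operator names concrete, here is a pure-Python reference sketch of what a batched gather matrix-vector (bgmv) shrink and expand compute. It follows the common Punica-style meaning of these names and is not the AscendC implementation: the tensor layouts, the `scale` argument, and the helper `matvec` are assumptions made for illustration. In the same convention, the sgmv variants perform the equivalent computation over segments of consecutive tokens that share one adapter.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Plain matrix-vector product; hypothetical helper for this sketch."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def bgmv_shrink(x: List[Vector], lora_a: List[Matrix], indices: List[int],
                scale: float = 1.0) -> List[Vector]:
    """Project each token from hidden size down to the LoRA rank.

    Assumed layout: lora_a[k] is the rank x hidden matrix of adapter k,
    and indices[i] selects the adapter used by token i.
    """
    return [[scale * v for v in matvec(lora_a[idx], token)]
            for token, idx in zip(x, indices)]


def bgmv_expand(y: List[Vector], lora_b: List[Matrix], indices: List[int],
                out: List[Vector]) -> List[Vector]:
    """Project each token from the LoRA rank back up and add into `out`.

    Assumed layout: lora_b[k] is the hidden x rank matrix of adapter k.
    `out` is updated in place and also returned.
    """
    for row, token, idx in zip(out, y, indices):
        for j, delta in enumerate(matvec(lora_b[idx], token)):
            row[j] += delta
    return out


# Two adapters with rank 1 and hidden size 2; each token picks its own adapter.
lora_a = [[[1.0, 0.0]], [[0.0, 2.0]]]
lora_b = [[[1.0], [1.0]], [[0.5], [0.0]]]
x = [[3.0, 4.0], [3.0, 4.0]]
indices = [0, 1]

shrunk = bgmv_shrink(x, lora_a, indices)
result = bgmv_expand(shrunk, lora_b, indices, [[0.0, 0.0], [0.0, 0.0]])
print(shrunk)  # [[3.0], [8.0]]
print(result)  # [[3.0, 3.0], [4.0, 0.0]]
```

The gather step is the per-token lookup `lora_a[idx]` and `lora_b[idx]`, which lets a single batch mix requests that target different adapters.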
