[Doc] Add Chinese doc (#10)
### What this PR does / why we need it?
This PR adds Chinese documents for vllm-ascend for Chinese-speaking developers.

### Does this PR introduce _any_ user-facing change?
Change as follows:
- add README.zh.md
- add environment.zh.md
- add CONTRIBUTING.zh.md

### How was this patch tested?
By CI

---------
Signed-off-by: wangli <wangli858794774@gmail.com>
102
CONTRIBUTING.zh.md
Normal file
@@ -0,0 +1,102 @@
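# Contributing to the vLLM Ascend Plugin

## Building and testing
We recommend building and testing in your local development environment before you submit a PR.

### Environment setup and build
In principle, a full vllm-ascend build is supported only on Linux, because the `vllm-ascend` dependency `torch_npu` supports only Linux.

You can still set up a development environment on Linux, Windows, or macOS for linting and basic tests, as shown in the following commands:
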
```bash
# Choose a base directory (~/vllm-project/) and create a Python virtual environment
cd ~/vllm-project/
python3 -m venv .venv
source ./.venv/bin/activate

# Clone and install vllm
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -r requirements-build.txt
VLLM_TARGET_DEVICE="empty" pip install .
cd ..

# Clone and install vllm-ascend
git clone https://github.com/vllm-project/vllm-ascend.git
cd vllm-ascend
pip install -r requirements-dev.txt

# Run the lint and mypy checks with the following script
bash format.sh

# Build:
# - A full build is currently supported only on Linux (torch_npu limitation)
# pip install -e .
# - To build and install on other operating systems, skip the dependencies
#   (build without deps for debugging on other OSes)
# pip install -e . --no-deps

# Commit your changes with `-s`
git commit -sm "your commit info"
```
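
The platform rule in the build comments above can be captured in a small helper. This is only a sketch: `vllm_ascend_install_cmd` is a hypothetical function name that does not exist in the repository, and it assumes that `uname -s` prints `Linux` on hosts where the full build is possible.

```bash
# Hypothetical helper (not part of vllm-ascend): print the editable-install
# command that matches the host operating system.
vllm_ascend_install_cmd() {
    # Accept an explicit OS name as $1; default to the running kernel name.
    os="${1:-$(uname -s)}"
    if [ "$os" = "Linux" ]; then
        # Full build: torch_npu can be resolved on Linux.
        echo "pip install -e ."
    else
        # Other systems: skip dependencies, because torch_npu is Linux-only.
        echo "pip install -e . --no-deps"
    fi
}

# Print the command for this machine without running it.
vllm_ascend_install_cmd
```

Printing the command instead of running it keeps the helper free of side effects, so the result can be reviewed before it is executed.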

### Testing
Although the vllm-ascend CI provides integration tests on [Ascend](.github/workflows/vllm_ascend_test.yaml), you can also run them locally. The simplest way to run these integration tests locally is in a container:

```bash
# On an Ascend NPU environment
git clone https://github.com/vllm-project/vllm-ascend.git
cd vllm-ascend

IMAGE=vllm-ascend-dev-image
CONTAINER_NAME=vllm-ascend-dev
DEVICE=/dev/davinci1

# The first build takes about 10 minutes (at 10 MB/s) to download the base image and packages
docker build -t "$IMAGE" -f ./Dockerfile .
# You can also set VLLM_REPO to a mirror repository to speed this up
# docker build -t "$IMAGE" -f ./Dockerfile . --build-arg VLLM_REPO=https://gitee.com/mirrors/vllm

docker run --name "$CONTAINER_NAME" --network host --device "$DEVICE" \
    --device /dev/davinci_manager --device /dev/devmm_svm \
    --device /dev/hisi_hdc -v /usr/local/dcmi:/usr/local/dcmi \
    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
    -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
    -ti --rm "$IMAGE" bash

cd vllm-ascend
pip install -r requirements-dev.txt

pytest tests/
```
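
## Developer Certificate of Origin (DCO)

When contributing changes to this project, you must agree to the DCO. Commits must include a "Signed-off-by:" header, which certifies agreement with the terms of the DCO.

Using `-s` with `git commit` adds this header automatically.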
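
Before pushing, it can help to confirm that a commit message actually carries the sign-off line. The following is a minimal sketch, not a project tool: `has_signoff` is a hypothetical helper, and the pattern only checks for a `Signed-off-by: Name <email>` line without validating the identity.

```bash
# Hypothetical helper: succeed if the commit message passed as $1 contains
# a line of the form "Signed-off-by: Name <email>".
has_signoff() {
    printf '%s\n' "$1" | grep -Eq '^Signed-off-by: .+ <.+>$'
}

# Example with a literal message; the name and address are placeholders.
msg='Fix typo in README

Signed-off-by: Jane Doe <jane@example.com>'
if has_signoff "$msg"; then
    echo "sign-off found"
fi
```

To check the latest commit in a repository, the message can be supplied with `has_signoff "$(git log -1 --format=%B)"`.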

## PR titles and classification

Only specific types of PRs will be reviewed. The PR title should carry an appropriate prefix that indicates the type of change. Please use one of the following:

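- `[Attention]` New features or optimizations in `attention`
- `[Communicator]` New features or optimizations in `communicators`
- `[ModelRunner]` New features or optimizations in `model runner`
- `[Platform]` New features or optimizations in `platform`
- `[Worker]` New features or optimizations in `worker`
- `[Core]` New features or optimizations in the core `vllm-ascend` logic (such as `platform, attention, communicators, model runner`)
- `[Kernel]` Changes affecting compute kernels and ops
- `[Bugfix]` Bug fixes
- `[Doc]` Documentation fixes and updates
- `[Test]` Tests (such as unit tests)
- `[CI]` Build or continuous integration improvements
- `[Misc]` For PRs that do not fit any of the categories above. Please use this prefix sparingly.

> [!NOTE]
> If the PR spans more than one category, please include all relevant prefixes.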
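
The prefix convention above can be checked mechanically before a PR is opened. This is a sketch under stated assumptions: `valid_pr_title` is a hypothetical helper, the project is not known to enforce titles this way, and the rule that a description must follow the prefixes is an assumption of this example.

```bash
# Hypothetical helper: succeed if the title in $1 starts with one or more of
# the listed prefixes and is followed by some description text.
valid_pr_title() {
    pr_prefixes='Attention|Communicator|ModelRunner|Platform|Worker|Core|Kernel|Bugfix|Doc|Test|CI|Misc'
    # One or more "[Prefix]" groups, each optionally followed by a space,
    # then a character that is neither whitespace nor another "[".
    printf '%s\n' "$1" | grep -Eq "^(\[($pr_prefixes)\] ?)+[^[:space:][]"
}

# Example: a title that spans two categories.
if valid_pr_title "[Core][Kernel] Fuse attention ops"; then
    echo "title accepted"
fi
```

Because an unknown bracketed tag stops the match, a title such as `[Doc][Foo] text` is rejected rather than silently accepted.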

## Others

You can find more information about contributing to the vLLM Ascend Plugin on [<u>docs.vllm.ai</u>](https://docs.vllm.ai/en/latest/contributing/overview.html).
If you run into any problems while contributing, feel free to submit a PR to improve the documentation and help other developers.
@@ -14,6 +14,10 @@ vLLM Ascend Plugin
| <a href="https://www.hiascend.com/en/"><b>About Ascend</b></a> | <a href="https://slack.vllm.ai"><b>Developer Slack (#sig-ascend)</b></a> |
</p>

<p align="center">
<a ><b>English</b></a> | <a href="README.zh.md"><b>中文</b></a>
</p>

---
*Latest News* 🔥

151
README.zh.md
Normal file
@@ -0,0 +1,151 @@
<p align="center">
<picture>
<!-- TODO: Replace tmp link to logo url after vllm-projects/vllm-ascend ready -->
<source media="(prefers-color-scheme: dark)" srcset="https://github.com/user-attachments/assets/4a958093-58b5-4772-a942-638b51ced646">
<img alt="vllm-ascend" src="https://github.com/user-attachments/assets/838afe2f-9a1d-42df-9758-d79b31556de0" width=55%>
</picture>
</p>

<h3 align="center">
vLLM Ascend Plugin
</h3>

<p align="center">
| <a href="https://www.hiascend.com/en/"><b>About Ascend</b></a> | <a href="https://slack.vllm.ai"><b>Developer Slack (#sig-ascend)</b></a> |
</p>

<p align="center">
<a href="README.md"><b>English</b></a> | <a><b>中文</b></a>
</p>

---
*Latest News* 🔥

- [2024/12] We are working with the vLLM community to support [[RFC]: Hardware pluggable](https://github.com/vllm-project/vllm/issues/11162).
---
## Overview

The vLLM Ascend Plugin (`vllm-ascend`) is a backend plugin for running vLLM seamlessly on Ascend NPUs.

This plugin is the recommended approach for supporting the Ascend backend within the vLLM community. It follows the principles outlined in [[RFC]: Hardware pluggable](https://github.com/vllm-project/vllm/issues/11162), providing vLLM support for Ascend NPUs in a decoupled manner.

With the vLLM Ascend Plugin, popular large language models, including Transformer-like, Mixture-of-Experts (MoE), embedding, and multimodal models, can run seamlessly on Ascend NPUs.

## Prerequisites
### Supported devices
- Atlas A2 Training series (Atlas 800T A2, Atlas 900 A2 PoD, Atlas 200T A2 Box16, Atlas 300T A2)
- Atlas 800I A2 Inference series (Atlas 800I A2)

### Dependencies
| Requirement | Supported version | Recommended version | Note |
|-------------|-------------------|---------------------|------------------------------------------|
| vLLM | main | main | Required for vllm-ascend |
| Python | >= 3.9 | [3.10](https://www.python.org/downloads/) | Required for vllm |
| CANN | >= 8.0.RC2 | [8.0.RC3](https://www.hiascend.com/developer/download/community/result?module=cann&cann=8.0.0.beta1) | Required for vllm-ascend and torch-npu |
| torch-npu | >= 2.4.0 | [2.5.1rc1](https://gitee.com/ascend/pytorch/releases/tag/v6.0.0.alpha001-pytorch2.5.1) | Required for vllm-ascend |
| torch | >= 2.4.0 | [2.5.1](https://github.com/pytorch/pytorch/releases/tag/v2.5.1) | Required for torch-npu and vllm |

Find out more about how to set up your environment [here](docs/environment.zh.md).
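
The minimum versions in the table can be compared against an installed version with a short shell check. This is a sketch, not a supported tool: `version_ge` is a hypothetical helper, it assumes a `sort` that supports `-V` (version sort, as in GNU coreutils), and it is meant for plain numeric versions such as `3.9` or `2.4.0`. Pre-release tags such as `8.0.RC2` or `2.5.1rc1` may not order as intended.

```bash
# Hypothetical helper: succeed if version $1 is greater than or equal to $2.
version_ge() {
    # After a version sort, the smaller of the two versions comes first;
    # the check passes when that smallest entry is the required minimum.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example with a literal version string against the Python minimum.
if version_ge "3.10.12" "3.9"; then
    echo "Python version is supported"
fi
```

The installed Python version can be obtained with `python3 -c 'import platform; print(platform.python_version())'` and passed as the first argument.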

## Getting started

> [!NOTE]
> We are actively collaborating with the vLLM community to support the Ascend backend plugin. Once it is supported, you will be able to install with a single command: `pip install vllm vllm-ascend`.

Install from source:
```bash
# Install the vllm main branch, following:
# https://docs.vllm.ai/en/latest/getting_started/installation/cpu/index.html#build-wheel-from-source
git clone --depth 1 https://github.com/vllm-project/vllm.git
cd vllm
pip install -r requirements-build.txt
VLLM_TARGET_DEVICE=empty pip install .
# Return to the parent directory so vllm-ascend is not cloned inside vllm
cd ..

# Install the vllm-ascend main branch
git clone https://github.com/vllm-project/vllm-ascend.git
cd vllm-ascend
pip install -e .
```

Run the following command to start the server with the [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) model:

```bash
# Set the environment variable VLLM_USE_MODELSCOPE=true to speed up the download
vllm serve Qwen/Qwen2.5-0.5B-Instruct
# Once the server is up, list the served models (from another terminal)
curl http://localhost:8000/v1/models
```

Please refer to the [vLLM Quickstart](https://docs.vllm.ai/en/latest/getting_started/quickstart.html) for more details.
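
Once the server started above is listening on port 8000, a completion request can be sent to its OpenAI-compatible `/v1/completions` endpoint. The sketch below only builds and prints the request body. `completion_payload` is a hypothetical helper, the prompt and the default of 16 tokens are illustrative values, and the prompt is inserted without JSON escaping, so it must not contain quotes or backslashes.

```bash
# Hypothetical helper: print a JSON body for /v1/completions.
# $1 is the model name, $2 the prompt, $3 an optional max_tokens (default 16).
completion_payload() {
    printf '{"model": "%s", "prompt": "%s", "max_tokens": %d}\n' "$1" "$2" "${3:-16}"
}

# Print the request body. To send it, pipe it to curl while the server runs:
#   completion_payload Qwen/Qwen2.5-0.5B-Instruct "Hello, my name is" |
#       curl http://localhost:8000/v1/completions \
#           -H "Content-Type: application/json" -d @-
completion_payload Qwen/Qwen2.5-0.5B-Instruct "Hello, my name is"
```

Keeping the payload builder separate from the `curl` call makes it possible to inspect the request before anything is sent to the server.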

## Building

#### Build the Python package from source

```bash
git clone https://github.com/vllm-project/vllm-ascend.git
cd vllm-ascend
pip install -e .
```

#### Build the container image
```bash
git clone https://github.com/vllm-project/vllm-ascend.git
cd vllm-ascend
docker build -t vllm-ascend-dev-image -f ./Dockerfile .
```

See [Building and Testing](./CONTRIBUTING.zh.md) for more details. It is a step-by-step guide that helps you set up the development environment, build, and test.

## Feature Support Matrix
| Feature | Supported | Note |
|---------|-----------|------|
| Chunked Prefill | ✗ | Planned for 2025 Q1 |
| Automatic Prefix Caching | ✅ | Performance improvements planned for 2025 Q1 |
| LoRA | ✗ | Planned for 2025 Q1 |
| Prompt adapter | ✅ ||
| Speculative decoding | ✅ | Accuracy improvements planned for 2025 Q1 |
| Pooling | ✗ | Planned for 2025 Q1 |
| Enc-dec | ✗ | Planned for 2025 Q1 |
| Multi Modality | ✅ (LLaVA/Qwen2-vl/Qwen2-audio/InternVL) | More model support planned for 2025 Q1 |
| LogProbs | ✅ ||
| Prompt logProbs | ✅ ||
| Async output | ✅ ||
| Multi step scheduler | ✅ ||
| Best of | ✅ ||
| Beam search | ✅ ||
| Guided Decoding | ✗ | Planned for 2025 Q1 |

## Model Support Matrix

The list below shows some of the supported models. See [supported_models](docs/supported_models.md) for more details:
| Model | Supported | Note |
|---------|-----------|------|
| Qwen 2.5 | ✅ ||
| Mistral | | Needs testing |
| DeepSeek v2.5 | | Needs testing |
| Llama 3.1/3.2 | ✅ ||
| Gemma-2 | | Needs testing |
| baichuan | | Needs testing |
| minicpm | | Needs testing |
| internlm | ✅ ||
| ChatGLM | ✅ ||
| InternVL 2.5 | ✅ ||
| Qwen2-VL | ✅ ||
| GLM-4v | | Needs testing |
| Molmo | ✅ ||
| LLaVA 1.5 | ✅ ||
| Mllama | | Needs testing |
| LLaVA-Next | | Needs testing |
| LLaVA-Next-Video | | Needs testing |
| Phi-3-Vision/Phi-3.5-Vision | | Needs testing |
| Ultravox | | Needs testing |
| Qwen2-Audio | ✅ ||


## Contributing
We welcome and value any contributions and collaborations:
- Please let us know about any bugs you encounter by [filing an issue](https://github.com/vllm-project/vllm-ascend/issues).
- Please see the contribution guidelines in [CONTRIBUTING.zh.md](./CONTRIBUTING.zh.md).
## License

Apache License 2.0, as found in the [LICENSE](./LICENSE) file.
38
docs/environment.zh.md
Normal file
@@ -0,0 +1,38 @@
### Preparing the Ascend NPU environment

### Dependencies
| Requirement | Supported version | Recommended version | Note |
|-------------|-------------------|---------------------|------------------------------------------|
| vLLM | main | main | Required for vllm-ascend |
| Python | >= 3.9 | [3.10](https://www.python.org/downloads/) | Required for vllm |
| CANN | >= 8.0.RC2 | [8.0.RC3](https://www.hiascend.com/developer/download/community/result?module=cann&cann=8.0.0.beta1) | Required for vllm-ascend and torch-npu |
| torch-npu | >= 2.4.0 | [2.5.1rc1](https://gitee.com/ascend/pytorch/releases/tag/v6.0.0.alpha001-pytorch2.5.1) | Required for vllm-ascend |
| torch | >= 2.4.0 | [2.5.1](https://github.com/pytorch/pytorch/releases/tag/v2.5.1) | Required for torch-npu and vllm |


The following is a brief guide to installing the recommended software versions:

#### Containerized installation

You can use the [container image](https://hub.docker.com/r/ascendai/cann) directly with a single command:

```bash
docker run \
    --name vllm-ascend-env \
    --device /dev/davinci1 \
    --device /dev/davinci_manager \
    --device /dev/devmm_svm \
    --device /dev/hisi_hdc \
    -v /usr/local/dcmi:/usr/local/dcmi \
    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
    -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
    -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
    -v /etc/ascend_install.info:/etc/ascend_install.info \
    -it quay.io/ascend/cann:8.0.rc3.beta1-910b-ubuntu22.04-py3.10 bash
```
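
The command above exposes a single NPU, `/dev/davinci1`. To expose several, one `--device` flag per card is needed. The sketch below generates those flags. `npu_device_flags` is a hypothetical helper, and it assumes that additional cards appear on the host as `/dev/davinciN`, which should be confirmed on the target machine (for example with `ls /dev/davinci*`).

```bash
# Hypothetical helper: print one "--device /dev/davinciN" flag for each NPU
# index given as an argument.
npu_device_flags() {
    flags=""
    for idx in "$@"; do
        flags="$flags --device /dev/davinci$idx"
    done
    # Strip the leading space before printing.
    printf '%s\n' "${flags# }"
}

# Print the flags for NPU 0 and NPU 1.
npu_device_flags 0 1
```

The output can be spliced into the command as `docker run $(npu_device_flags 0 1) ...`, leaving the remaining `--device` and `-v` options unchanged.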
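
You do not need to install `torch` and `torch_npu` manually; they will be installed automatically as `vllm-ascend` dependencies.

#### Manual installation

Alternatively, you can install manually by following the instructions in the [Ascend Installation Guide](https://ascend.github.io/docs/sources/ascend/quick_install.html) to set up the environment.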