[Doc] Add sphinx build for vllm-ascend (#55)

### What this PR does / why we need it?

This patch enables the doc build for vllm-ascend

- Add sphinx build for vllm-ascend
- Enable readthedocs for vllm-ascend
- Fix CI:
- exclude vllm-empty/tests/mistral_tool_use to skip `You need to agree
to share your contact information to access this model` which introduce
in
314cfade02
- Install test req to fix
https://github.com/vllm-project/vllm-ascend/actions/runs/13304112758/job/37151690770:
      ```
      vllm-empty/tests/mistral_tool_use/conftest.py:4: in <module>
          import pytest_asyncio
      E   ModuleNotFoundError: No module named 'pytest_asyncio'
      ```
  - exclude docs PR

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
1. test locally:
    ```bash
    # Install dependencies.
    pip install -r requirements-docs.txt
    
    # Build the docs and preview
    make clean; make html; python -m http.server -d build/html/
    ```
    
    Launch browser and open http://localhost:8000/.

2. CI passed with preview:
    https://vllm-ascend--55.org.readthedocs.build/en/55/

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
This commit is contained in:
Yikun Jiang
2025-02-13 18:44:17 +08:00
committed by GitHub
parent 63b11ec7e9
commit 46977f9f06
22 changed files with 255 additions and 36 deletions

21
docs/Makefile Normal file
View File

@@ -0,0 +1,21 @@
# Minimal makefile for Sphinx documentation
#
# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = _build
# Put it first so that "make" without argument is like "make help".
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
.PHONY: help Makefile
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

21
docs/README.md Normal file
View File

@@ -0,0 +1,21 @@
# vLLM Ascend Plugin documents
## Build the docs
```bash
# Install dependencies.
pip install -r requirements-docs.txt
# Build the docs.
make clean
make html
```
## Open the docs with your browser
```bash
python -m http.server -d build/html/
```
Launch your browser and open http://localhost:8000/.

View File

@@ -1,15 +0,0 @@
# Ascend plugin for vLLM
vLLM Ascend plugin (vllm-ascend) is a community maintained hardware plugin for running vLLM on the Ascend NPU.
This plugin is the recommended approach for supporting the Ascend backend within the vLLM community. It adheres to the principles outlined in the [[RFC]: Hardware pluggable](https://github.com/vllm-project/vllm/issues/11162), providing a hardware-pluggable interface that decouples the integration of the Ascend NPU with vLLM.
By using vLLM Ascend plugin, popular open-source models, including Transformer-like, Mixture-of-Expert, Embedding, Multi-modal LLMs can run seamlessly on the Ascend NPU.
## Contents
- [Quick Start](./quick_start.md)
- [Installation](./installation.md)
- Usage
- [Running vLLM with Ascend](./usage/running_vllm_with_ascend.md)
- [Feature Support](./usage/feature_support.md)
- [Supported Models](./usage/supported_models.md)

View File

@@ -0,0 +1,8 @@
sphinx==6.2.1
sphinx-argparse==0.4.0
sphinx-book-theme==1.0.1
sphinx-copybutton==0.5.2
sphinx-design==0.6.1
sphinx-togglebutton==0.3.2
myst-parser==3.0.1
msgspec

View File

@@ -0,0 +1,2 @@
pytest-asyncio

105
docs/source/conf.py Normal file
View File

@@ -0,0 +1,105 @@
#
# Copyright (c) 2025 Huawei Technologies Co., Ltd. All Rights Reserved.
# This file is a part of the vllm-ascend project.
# Adapted from vllm-project/vllm/docs/source/conf.py
# Copyright 2023 The vLLM team.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# -- Path setup --------------------------------------------------------------
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))
# -- Project information -----------------------------------------------------
project = 'vllm-ascend'
copyright = '2025, vllm-ascend team'
author = 'the vllm-ascend team'
# The full version, including alpha/beta/rc tags
release = ''
# -- General configuration ---------------------------------------------------
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
# Copy from https://github.com/vllm-project/vllm/blob/main/docs/source/conf.py
extensions = [
"sphinx.ext.napoleon",
"sphinx.ext.intersphinx",
"sphinx_copybutton",
"sphinx.ext.autodoc",
"sphinx.ext.autosummary",
"myst_parser",
"sphinxarg.ext",
"sphinx_design",
"sphinx_togglebutton",
]
myst_enable_extensions = [
"colon_fence",
]
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = 'en'
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = [
'_build',
'Thumbs.db',
'.DS_Store',
'.venv',
'README.md',
# TODO(yikun): Remove this after zh supported
'developer_guide/contributing.zh.md'
]
# -- Options for HTML output -------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_title = project
html_theme = 'sphinx_book_theme'
html_logo = 'logos/vllm-ascend-logo-text-light.png'
html_theme_options = {
'path_to_docs': 'docs/source',
'repository_url': 'https://github.com/vllm-project/vllm-ascend',
'use_repository_button': True,
'use_edit_page_button': True,
}
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
# html_static_path = ['_static']
def setup(app):
pass

View File

@@ -0,0 +1,107 @@
# Contributing
## Building and testing
It's recommended to set up a local development environment to build and test
before you submit a PR.
### Prepare environment and build
Theoretically, the vllm-ascend build is only supported on Linux because
`vllm-ascend` dependency `torch_npu` only supports Linux.
But you can still set up dev env on Linux/Windows/macOS for linting and basic
test as following commands:
```bash
# Choose a base dir (~/vllm-project/) and set up venv
cd ~/vllm-project/
python3 -m venv .venv
source ./.venv/bin/activate
# Clone vllm code and install
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -r requirements-build.txt
VLLM_TARGET_DEVICE="empty" pip install .
cd ..
# Clone vllm-ascend and install
git clone https://github.com/vllm-project/vllm-ascend.git
cd vllm-ascend
pip install -r requirements-dev.txt
# Then you can run lint and mypy test
bash format.sh
# Build:
# - only supported on Linux (torch_npu available)
# pip install -e .
# - build without deps for debugging in other OS
# pip install -e . --no-deps
# Commit changed files using `-s`
git commit -sm "your commit info"
```
### Testing
Although vllm-ascend CI provide integration test on [Ascend](https://github.com/vllm-project/vllm-ascend/blob/main/.github/workflows/vllm_ascend_test.yaml), you can run it
locally. The simplest way to run these integration tests locally is through a container:
```bash
# Under Ascend NPU environment
git clone https://github.com/vllm-project/vllm-ascend.git
cd vllm-ascend
IMAGE=vllm-ascend-dev-image
CONTAINER_NAME=vllm-ascend-dev
DEVICE=/dev/davinci1
# The first build will take about 10 mins (10MB/s) to download the base image and packages
docker build -t $IMAGE -f ./Dockerfile .
# You can also specify the mirror repo via setting VLLM_REPO to speedup
# docker build -t $IMAGE -f ./Dockerfile . --build-arg VLLM_REPO=https://gitee.com/mirrors/vllm
docker run --name $CONTAINER_NAME --network host --device $DEVICE \
--device /dev/davinci_manager --device /dev/devmm_svm \
--device /dev/hisi_hdc -v /usr/local/dcmi:/usr/local/dcmi \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
-ti --rm $IMAGE bash
cd vllm-ascend
pip install -r requirements-dev.txt
pytest tests/
```
## DCO and Signed-off-by
When contributing changes to this project, you must agree to the DCO. Commits must include a `Signed-off-by:` header which certifies agreement with the terms of the DCO.
Using `-s` with `git commit` will automatically add this header.
## PR Title and Classification
Only specific types of PRs will be reviewed. The PR title is prefixed appropriately to indicate the type of change. Please use one of the following:
- `[Attention]` for new features or optimization in attention.
- `[Communicator]` for new features or optimization in communicators.
- `[ModelRunner]` for new features or optimization in model runner.
- `[Platform]` for new features or optimization in platform.
- `[Worker]` for new features or optimization in worker.
- `[Core]` for new features or optimization in the core vllm-ascend logic (such as platform, attention, communicators, model runner)
- `[Kernel]` changes affecting compute kernels and ops.
- `[Bugfix]` for bug fixes.
- `[Doc]` for documentation fixes and improvements.
- `[Test]` for tests (such as unit tests).
- `[CI]` for build or continuous integration improvements.
- `[Misc]` for PRs that do not fit the above categories. Please use this sparingly.
> [!NOTE]
> If the PR spans more than one category, please include all relevant prefixes.
## Others
You may find more information about contributing to vLLM Ascend backend plugin on [<u>docs.vllm.ai</u>](https://docs.vllm.ai/en/latest/contributing/overview.html).
If you find any problem when contributing, you can feel free to submit a PR to improve the doc to help other developers.

View File

@@ -0,0 +1,102 @@
# 贡献指南
## 构建与测试
我们推荐您在提交PR之前在本地开发环境进行构建和测试。
### 环境准备与构建
理论上vllm-ascend 构建仅支持 Linux因为`vllm-ascend` 依赖项 `torch_npu` 仅支持 Linux。
但是您仍然可以在 Linux/Windows/macOS 上配置开发环境进行代码检查和基本测试,如下命令所示:
```bash
# 选择基础文件夹 (~/vllm-project/) 创建python虚拟环境
cd ~/vllm-project/
python3 -m venv .venv
source ./.venv/bin/activate
# 克隆并安装vllm
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -r requirements-build.txt
VLLM_TARGET_DEVICE="empty" pip install .
cd ..
# 克隆并安装vllm-ascend
git clone https://github.com/vllm-project/vllm-ascend.git
cd vllm-ascend
pip install -r requirements-dev.txt
# 通过执行以下脚本以运行 lint 及 mypy 测试
bash format.sh
# 构建:
# - 目前仅支持在Linux上进行完整构建torch_npu 限制)
# pip install -e .
# - 在其他操作系统上构建安装,需要跳过依赖
# - build without deps for debugging in other OS
# pip install -e . --no-deps
# 使用 `-s` 提交更改
git commit -sm "your commit info"
```
### 测试
虽然 vllm-ascend CI 提供了对 [Ascend](https://github.com/vllm-project/vllm-ascend/blob/main/.github/workflows/vllm_ascend_test.yaml) 的集成测试,但您也可以在本地运行它。在本地运行这些集成测试的最简单方法是通过容器:
```bash
# 基于昇腾NPU环境
git clone https://github.com/vllm-project/vllm-ascend.git
cd vllm-ascend
IMAGE=vllm-ascend-dev-image
CONTAINER_NAME=vllm-ascend-dev
DEVICE=/dev/davinci1
# 首次构建会花费10分钟10MB/s下载基础镜像和包
docker build -t $IMAGE -f ./Dockerfile .
# 您还可以通过设置 VLLM_REPO 来指定镜像仓库以加速
# docker build -t $IMAGE -f ./Dockerfile . --build-arg VLLM_REPO=https://gitee.com/mirrors/vllm
docker run --name $CONTAINER_NAME --network host --device $DEVICE \
--device /dev/davinci_manager --device /dev/devmm_svm \
--device /dev/hisi_hdc -v /usr/local/dcmi:/usr/local/dcmi \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
-ti --rm $IMAGE bash
cd vllm-ascend
pip install -r requirements-dev.txt
pytest tests/
```
## 开发者来源证书(DCO)
在向本项目提交贡献时,您必须同意 DCO。提交必须包含“Signed-off-by:”标头,以证明同意 DCO 的条款。
`git commit`时使用`-s`将会自动添加该标头。
## PR 标题和分类
仅特定类型的 PR 会被审核。PR 标题会以适当的前缀来表明变更类型。请使用以下之一:
- `[Attention]` 关于`attention`的新特性或优化
- `[Communicator]` 关于`communicators`的新特性或优化
- `[ModelRunner]` 关于`model runner`的新特性或优化
- `[Platform]` 关于`platform`的新特性或优化
- `[Worker]` 关于`worker`的新特性或优化
- `[Core]` 关于`vllm-ascend`核心逻辑 (如 `platform, attention, communicators, model runner`)的新特性或优化
- `[Kernel]` 影响计算内核和操作的更改.
- `[Bugfix]` bug修复
- `[Doc]` 文档的修复与更新
- `[Test]` 测试 (如:单元测试)
- `[CI]` 构建或持续集成改进
- `[Misc]` 适用于更改内容对于上述类别均不适用的PR请谨慎使用该前缀
> [!注意]
> 如果 PR 涉及多个类别,请添加所有相关前缀
## 其他
您可以在 [<u>docs.vllm.ai</u>](https://docs.vllm.ai/en/latest/contributing/overview.html) 上找到更多有关为 vLLM 昇腾插件贡献的信息。
如果您在贡献过程中发现任何问题,您可以随时提交 PR 来改进文档以帮助其他开发人员。

53
docs/source/index.md Normal file
View File

@@ -0,0 +1,53 @@
# Welcome to vLLM Ascend Plugin
:::{figure} ./logos/vllm-ascend-logo-text-light.png
:align: center
:alt: vLLM
:class: no-scaled-link
:width: 70%
:::
:::{raw} html
<p style="text-align:center">
<strong>vLLM Ascend Plugin
</strong>
</p>
<p style="text-align:center">
<script async defer src="https://buttons.github.io/buttons.js"></script>
<a class="github-button" href="https://github.com/vllm-project/vllm-ascend" data-show-count="true" data-size="large" aria-label="Star">Star</a>
<a class="github-button" href="https://github.com/vllm-project/vllm-ascend/subscription" data-icon="octicon-eye" data-size="large" aria-label="Watch">Watch</a>
<a class="github-button" href="https://github.com/vllm-project/vllm-ascend/fork" data-icon="octicon-repo-forked" data-size="large" aria-label="Fork">Fork</a>
</p>
:::
vLLM Ascend plugin (vllm-ascend) is a community maintained hardware plugin for running vLLM on the Ascend NPU.
This plugin is the recommended approach for supporting the Ascend backend within the vLLM community. It adheres to the principles outlined in the [[RFC]: Hardware pluggable](https://github.com/vllm-project/vllm/issues/11162), providing a hardware-pluggable interface that decouples the integration of the Ascend NPU with vLLM.
By using vLLM Ascend plugin, popular open-source models, including Transformer-like, Mixture-of-Expert, Embedding, Multi-modal LLMs can run seamlessly on the Ascend NPU.
## Documentation
% How to start using vLLM on Ascend NPU?
:::{toctree}
:caption: Getting Started
:maxdepth: 1
quick_start
installation
:::
% What does vLLM Ascend Plugin support?
:::{toctree}
:caption: Features
:maxdepth: 1
features/suppoted_features
features/supported_models
:::
% How to contribute to the vLLM project
:::{toctree}
:caption: Developer Guide
:maxdepth: 1
developer_guide/contributing
:::

View File

@@ -1,6 +1,6 @@
# Installation
### 1. Dependencies
## Dependencies
| Requirement | Supported version | Recommended version | Note |
| ------------ | ------- | ----------- | ----------- |
| Python | >= 3.9 | [3.10](https://www.python.org/downloads/) | Required for vllm |
@@ -8,11 +8,11 @@
| torch-npu | >= 2.4.0 | [2.5.1rc1](https://gitee.com/ascend/pytorch/releases/tag/v6.0.0.alpha001-pytorch2.5.1) | Required for vllm-ascend |
| torch | >= 2.4.0 | [2.5.1](https://github.com/pytorch/pytorch/releases/tag/v2.5.1) | Required for torch-npu and vllm required |
### 2. Prepare Ascend NPU environment
## Prepare Ascend NPU environment
Below is a quick note to install recommended version software:
#### Containerized installation
### Containerized installation
You can use the [container image](https://hub.docker.com/r/ascendai/cann) directly with one line command:
@@ -33,13 +33,13 @@ docker run \
You do not need to install `torch` and `torch_npu` manually, they will be automatically installed as `vllm-ascend` dependencies.
#### Manual installation
### Manual installation
Or follow the instructions provided in the [Ascend Installation Guide](https://ascend.github.io/docs/sources/ascend/quick_install.html) to set up the environment.
### 3. Building
## Building
#### Build Python package from source
### Build Python package from source
```bash
git clone https://github.com/vllm-project/vllm-ascend.git
@@ -47,7 +47,7 @@ cd vllm-ascend
pip install -e .
```
#### Build container image from source
### Build container image from source
```bash
git clone https://github.com/vllm-project/vllm-ascend.git
cd vllm-ascend

View File

Before

Width:  |  Height:  |  Size: 153 KiB

After

Width:  |  Height:  |  Size: 153 KiB

View File

Before

Width:  |  Height:  |  Size: 153 KiB

After

Width:  |  Height:  |  Size: 153 KiB

View File

@@ -1,6 +1,6 @@
# Quickstart
## 1. Prerequisites
## Prerequisites
### Supported Devices
- Atlas A2 Training series (Atlas 800T A2, Atlas 900 A2 PoD, Atlas 200T A2 Box16, Atlas 300T A2)
@@ -48,7 +48,7 @@ You will see following message:
```
## 2. Installation
## Installation
Prepare:
@@ -84,7 +84,7 @@ cd ..
```
## 3. Usage
## Usage
After vLLM and vLLM Ascend plugin installation, you can start to
try [vLLM QuickStart](https://docs.vllm.ai/en/latest/getting_started/quickstart.html).

View File

@@ -1 +0,0 @@
# Running vLLM with Ascend