[Doc] Add patch doc (#1414)
1. Format the developer guide content to make it more clear
2. Add the patch doc for developer guide

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
94
docs/source/developer_guide/contribution/index.md
Normal file
@@ -0,0 +1,94 @@
# Contributing

## Building and testing

It's recommended to set up a local development environment to build and test
before you submit a PR.

### Setup development environment

In principle, the vllm-ascend build is only supported on Linux, because
the `vllm-ascend` dependency `torch_npu` only supports Linux.

However, you can still set up a development environment on Linux/Windows/macOS for linting and basic
tests with the following commands:

```bash
# Choose a base dir (~/vllm-project/) and set up venv
mkdir -p ~/vllm-project/
cd ~/vllm-project/
python3 -m venv .venv
source ./.venv/bin/activate

# Clone vllm code and install
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -r requirements/build.txt
VLLM_TARGET_DEVICE="empty" pip install .
cd ..

# Clone vllm-ascend and install
git clone https://github.com/vllm-project/vllm-ascend.git
cd vllm-ascend
# Install system requirements
apt install -y gcc g++ cmake libnuma-dev
# Install project requirements
pip install -r requirements-dev.txt

# Then you can run the lint and mypy checks
bash format.sh

# Build:
# - only supported on Linux (torch_npu available)
# pip install -e .
# - build without deps for debugging on other OSes
# pip install -e . --no-deps
# - build without custom ops
# COMPILE_CUSTOM_KERNELS=0 pip install -e .

# Commit changed files using `-s`
git commit -sm "your commit info"
```
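
The three build variants listed in the comments above can be summarized as a small decision helper. This is only an illustrative sketch: the function name `suggest_install_command` and its parameters are hypothetical and not part of vllm-ascend; only the returned command strings come from the guide.

```python
def suggest_install_command(system: str, compile_custom_kernels: bool = True) -> str:
    """Pick one of the guide's editable-install commands.

    `system` is an OS name as returned by platform.system(), e.g. "Linux".
    This helper is hypothetical; it only mirrors the comments in the guide.
    """
    if system != "Linux":
        # torch_npu is Linux-only, so skip dependency resolution elsewhere.
        return "pip install -e . --no-deps"
    if not compile_custom_kernels:
        # Linux build that skips compiling the custom ops.
        return "COMPILE_CUSTOM_KERNELS=0 pip install -e ."
    # Full build on Linux.
    return "pip install -e ."
```

For example, `suggest_install_command(platform.system())` (after `import platform`) returns the command that matches the current machine.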

🎉 Congratulations! You have completed the development environment setup.

### Test locally

Refer to the [Testing](./testing.md) doc for help setting up a testing environment and running tests locally.

## DCO and Signed-off-by

When contributing changes to this project, you must agree to the DCO. Commits must include a `Signed-off-by:` header, which certifies agreement with the terms of the DCO.

Using `-s` with `git commit` will automatically add this header.
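
To check a commit message for the sign-off before pushing, a minimal sketch along these lines may help. The helper name `has_signoff` and the exact pattern are assumptions for illustration; the project's actual DCO check may be stricter or differ in detail.

```python
import re

# Matches a line such as "Signed-off-by: Jane Doe <jane@example.com>".
# The pattern is an illustrative assumption, not the project's official check.
SIGNOFF_RE = re.compile(r"^Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>$", re.MULTILINE)


def has_signoff(message: str) -> bool:
    """Return True if the commit message contains a Signed-off-by line."""
    return SIGNOFF_RE.search(message) is not None
```

The message can be obtained with `git log -1 --format=%B` and passed to `has_signoff`.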

## PR Title and Classification

Only specific types of PRs will be reviewed. Prefix the PR title appropriately to indicate the type of change. Please use one of the following:

- `[Attention]` for new features or optimizations in attention.
- `[Communicator]` for new features or optimizations in communicators.
- `[ModelRunner]` for new features or optimizations in the model runner.
- `[Platform]` for new features or optimizations in the platform.
- `[Worker]` for new features or optimizations in the worker.
- `[Core]` for new features or optimizations in the core vllm-ascend logic (such as platform, attention, communicators, model runner).
- `[Kernel]` for changes affecting compute kernels and ops.
- `[Bugfix]` for bug fixes.
- `[Doc]` for documentation fixes and improvements.
- `[Test]` for tests (such as unit tests).
- `[CI]` for build or continuous integration improvements.
- `[Misc]` for PRs that do not fit the above categories. Please use this sparingly.

:::{note}
If the PR spans more than one category, please include all relevant prefixes.
:::
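
The prefix rule can be checked mechanically. The sketch below is illustrative only: the helper names `leading_prefixes` and `is_valid_title` are hypothetical, and the prefix set is copied from the list above. It accepts one or more allowed prefixes at the start of the title.

```python
import re
from typing import List

# Prefixes copied from the list in this guide.
ALLOWED_PREFIXES = {
    "Attention", "Communicator", "ModelRunner", "Platform", "Worker", "Core",
    "Kernel", "Bugfix", "Doc", "Test", "CI", "Misc",
}

_PREFIX_RE = re.compile(r"\[([^\]]+)\]\s*")


def leading_prefixes(title: str) -> List[str]:
    """Return the bracketed prefixes at the start of a PR title, in order."""
    prefixes = []
    rest = title.lstrip()
    while True:
        match = _PREFIX_RE.match(rest)
        if match is None:
            break
        prefixes.append(match.group(1))
        rest = rest[match.end():]
    return prefixes


def is_valid_title(title: str) -> bool:
    """True if the title starts with at least one prefix and all are allowed."""
    prefixes = leading_prefixes(title)
    return bool(prefixes) and all(p in ALLOWED_PREFIXES for p in prefixes)
```

With this sketch, `[Doc] Add patch doc (#1414)` and `[Bugfix][Kernel] Fix op` pass, while a title without a prefix or with an unlisted prefix is rejected.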

## Others

You can find more information about contributing to the vLLM Ascend backend plugin on [<u>docs.vllm.ai</u>](https://docs.vllm.ai/en/latest/contributing/overview.html).
If you run into any problems while contributing, feel free to submit a PR that improves this doc to help other developers.

:::{toctree}
:caption: Index
:maxdepth: 1
testing
:::
183
docs/source/developer_guide/contribution/testing.md
Normal file
@@ -0,0 +1,183 @@
# Testing

This section explains how to write e2e tests and unit tests to verify the implementation of your feature.

## Setup test environment

The fastest way to set up a test environment is to use the main branch container image:

:::::{tab-set}
:sync-group: e2e

::::{tab-item} Single card
:selected:
:sync: single

```{code-block} bash
   :substitutions:

# Update DEVICE according to your device (/dev/davinci[0-7])
export DEVICE=/dev/davinci0
# Update the vllm-ascend image
export IMAGE=quay.io/ascend/vllm-ascend:main
docker run --rm \
    --name vllm-ascend \
    --device $DEVICE \
    --device /dev/davinci_manager \
    --device /dev/devmm_svm \
    --device /dev/hisi_hdc \
    -v /usr/local/dcmi:/usr/local/dcmi \
    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
    -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
    -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
    -v /etc/ascend_install.info:/etc/ascend_install.info \
    -v /root/.cache:/root/.cache \
    -p 8000:8000 \
    -it $IMAGE bash
```

::::

::::{tab-item} Multi cards
:sync: multi

```{code-block} bash
   :substitutions:

# Update the vllm-ascend image
export IMAGE=quay.io/ascend/vllm-ascend:main
docker run --rm \
    --name vllm-ascend \
    --device /dev/davinci0 \
    --device /dev/davinci1 \
    --device /dev/davinci2 \
    --device /dev/davinci3 \
    --device /dev/davinci_manager \
    --device /dev/devmm_svm \
    --device /dev/hisi_hdc \
    -v /usr/local/dcmi:/usr/local/dcmi \
    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
    -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
    -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
    -v /etc/ascend_install.info:/etc/ascend_install.info \
    -v /root/.cache:/root/.cache \
    -p 8000:8000 \
    -it $IMAGE bash
```
::::

:::::
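
The two tabs differ only in which `/dev/davinciN` nodes are passed to `docker run`. As a rough sketch, the `--device` arguments for an arbitrary set of cards could be generated as follows. The helper name `device_flags` is hypothetical; the device paths and the `0-7` range are taken from the commands above.

```python
from typing import List

# Device nodes that both tabs above always pass through.
COMMON_DEVICES = ["/dev/davinci_manager", "/dev/devmm_svm", "/dev/hisi_hdc"]


def device_flags(card_ids: List[int]) -> List[str]:
    """Build the `--device` arguments for the given NPU card indices."""
    flags: List[str] = []
    for card_id in card_ids:
        # The guide mentions /dev/davinci[0-7]; reject anything outside that range.
        if not 0 <= card_id <= 7:
            raise ValueError(f"card id out of range 0-7: {card_id}")
        flags += ["--device", f"/dev/davinci{card_id}"]
    for device in COMMON_DEVICES:
        flags += ["--device", device]
    return flags
```

Calling `device_flags([0, 1, 2, 3])` yields the same device arguments as the multi-card command, in the same order.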

After starting the container, install the required packages:

```bash
# Prepare: point pip at the Tsinghua (TUNA) PyPI mirror
pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple

# Install required packages (run from the vllm-ascend source directory)
cd /vllm-workspace/vllm-ascend/
pip install -r requirements-dev.txt
```

## Running tests

### Unit test

There are several principles to follow when writing unit tests:

- The test file path should mirror the source file path, and the file name should start with the `test_` prefix, for example: `vllm_ascend/worker/worker_v1.py` --> `tests/ut/worker/test_worker_v1.py`
- vLLM Ascend tests use the unittest framework; see [here](https://docs.python.org/3/library/unittest.html#module-unittest) to understand how to write unit tests.
- All unit tests must be runnable on CPU, so you must mock device-related functions so that they run on the host.
- Example: [tests/ut/test_ascend_config.py](https://github.com/vllm-project/vllm-ascend/blob/main/tests/ut/test_ascend_config.py).
- You can run the unit tests using `pytest`:

```bash
cd /vllm-workspace/vllm-ascend/
# Run all unit tests
pytest -sv tests/ut

# Run a single test file
pytest -sv tests/ut/test_ascend_config.py
```
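
The "mock device-related functions" principle can be illustrated with a self-contained sketch. Everything here is hypothetical: `FakeDeviceBackend` and `describe_devices` are stand-ins invented for this example, not vllm-ascend APIs. Only `unittest` and `unittest.mock` from the standard library are used.

```python
import unittest
from unittest import mock


class FakeDeviceBackend:
    """Hypothetical stand-in for a device-facing module (not a vllm-ascend API)."""

    @staticmethod
    def device_count() -> int:
        # On real hardware this would query the NPU driver; on a CPU host it fails.
        raise RuntimeError("no NPU available on this host")


def describe_devices() -> str:
    """Code under test: formats the number of visible devices."""
    return f"{FakeDeviceBackend.device_count()} device(s) visible"


class TestDescribeDevices(unittest.TestCase):
    @mock.patch.object(FakeDeviceBackend, "device_count", return_value=2)
    def test_reports_mocked_count(self, mock_count):
        # The device call is replaced by a mock, so the test runs on CPU.
        self.assertEqual(describe_devices(), "2 device(s) visible")
        mock_count.assert_called_once_with()
```

Saved as a `test_*.py` file, this can be run with `pytest -sv <file>` or `python -m unittest <file>`.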

### E2E test

Although vllm-ascend CI provides the [e2e test](https://github.com/vllm-project/vllm-ascend/blob/main/.github/workflows/vllm_ascend_test.yaml) on Ascend CI, you can also run it
locally.

:::::{tab-set}
:sync-group: e2e

::::{tab-item} Single card
:selected:
:sync: single

```bash
cd /vllm-workspace/vllm-ascend/
# Run all single card tests
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/singlecard/

# Run a certain test script
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/singlecard/test_offline_inference.py

# Run a certain case in test script
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/singlecard/test_offline_inference.py::test_models
```
::::

::::{tab-item} Multi cards test
:sync: multi

```bash
cd /vllm-workspace/vllm-ascend/
# Run all multi card tests
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/multicard/

# Run a certain test script
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/multicard/test_dynamic_npugraph_batchsize.py

# Run a certain case in test script
VLLM_USE_MODELSCOPE=true pytest -sv tests/e2e/multicard/test_offline_inference.py::test_models
```
::::

:::::

This will reproduce the e2e test: [vllm_ascend_test.yaml](https://github.com/vllm-project/vllm-ascend/blob/main/.github/workflows/vllm_ascend_test.yaml).

#### E2E test example:

- Offline test example: [`tests/e2e/singlecard/test_offline_inference.py`](https://github.com/vllm-project/vllm-ascend/blob/main/tests/e2e/singlecard/test_offline_inference.py)
- Online test example: [`tests/e2e/singlecard/test_prompt_embedding.py`](https://github.com/vllm-project/vllm-ascend/blob/main/tests/e2e/singlecard/test_prompt_embedding.py)
- Correctness test example: [`tests/e2e/singlecard/test_aclgraph.py`](https://github.com/vllm-project/vllm-ascend/blob/main/tests/e2e/singlecard/test_aclgraph.py)
- Reduced Layer model test example: [test_torchair_graph_mode.py - DeepSeek-V3-Pruning](https://github.com/vllm-project/vllm-ascend/blob/20767a043cccb3764214930d4695e53941de87ec/tests/e2e/multicard/test_torchair_graph_mode.py#L48)

CI resources are limited, so you might need to reduce the number of layers in the model. Below is an example of how to generate a reduced-layer model:

1. Fork the original model repo on ModelScope; we need all the files in the repo except for the weights.
2. Set `num_hidden_layers` in the model's `config.json` to the expected number of layers, e.g., `{"num_hidden_layers": 2}`
3. Copy the following Python script as `generate_random_weight.py`. Set the relevant parameters `MODEL_LOCAL_PATH`, `DIST_DTYPE` and `DIST_MODEL_PATH` as needed:

```python
import os

import torch
from transformers import AutoConfig
# modeling_deepseek.py comes from the model repo and must be importable
# (for example, run this script from the directory that contains it).
from modeling_deepseek import DeepseekV3ForCausalLM

# Expand "~" explicitly; from_pretrained does not expand it.
MODEL_LOCAL_PATH = os.path.expanduser("~/.cache/modelscope/models/vllm-ascend/DeepSeek-V3-Pruning")
DIST_DTYPE = torch.bfloat16
DIST_MODEL_PATH = "./random_deepseek_v3_with_2_hidden_layer"

# Build the model from the (reduced) config only, so the weights are randomly initialized.
config = AutoConfig.from_pretrained(MODEL_LOCAL_PATH, trust_remote_code=True)
model = DeepseekV3ForCausalLM(config)
model = model.to(DIST_DTYPE)
# Save the randomly initialized weights to the output directory.
model.save_pretrained(DIST_MODEL_PATH)
```
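
Step 2 above can also be scripted instead of editing the file by hand. This is a minimal sketch under the assumption that the model directory contains a standard `config.json`; the helper name `set_num_hidden_layers` is hypothetical. Other layer-dependent fields, if the model has any, are not touched.

```python
import json
from pathlib import Path


def set_num_hidden_layers(model_dir: str, num_layers: int) -> dict:
    """Rewrite `num_hidden_layers` in <model_dir>/config.json and return the new config."""
    config_path = Path(model_dir).expanduser() / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    config["num_hidden_layers"] = num_layers
    # Write the config back in place, keeping all other keys unchanged.
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `set_num_hidden_layers(MODEL_LOCAL_PATH, 2)` would prepare the config before running `generate_random_weight.py`.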

### Run doctest

vllm-ascend provides the `vllm-ascend/tests/e2e/run_doctests.sh` script to run all doctests in the doc files.
Doctests are a good way to make sure the docs are up to date and the examples are executable. You can run them locally as follows:

```bash
# Run doctest
/vllm-workspace/vllm-ascend/tests/e2e/run_doctests.sh
```

This will reproduce the same environment as the CI: [vllm_ascend_doctest.yaml](https://github.com/vllm-project/vllm-ascend/blob/main/.github/workflows/vllm_ascend_doctest.yaml).