[Core] Init vllm-ascend (#3)

### What this PR does / why we need it?

vLLM Ascend plugin (vllm-ascend) is a backend plugin for running vLLM on the Ascend NPU.

This plugin is the recommended approach for supporting the Ascend backend within the vLLM community. It adheres to the principles outlined in the [RFC]: Hardware pluggable, providing a hardware-pluggable interface that decouples the integration of the Ascend NPU with vLLM.

This patch also includes changes to make CI work and to use caches to speed up the e2e test, including:

1. Change the push (post-merge CI) and pull_request (PR CI) trigger branch to main
2. Make mypy work by ignoring base_communicator and clearing unused deps
3. Several improvements for vllm_ascend_test:
   - Use caches (pip, ms, hf) to speed up the e2e test (25mins --> 5mins)
   - Switch the `git clone` command to `actions/checkout` to speed up checkout
   - Enable `-sv` for pytest for better info dump
   - Remove network host to resolve `docker: conflicting options: cannot attach both user-defined and non-user-defined network-modes`, which is a problem on docker 1.45 but not on 1.39.
4. Adapt MLA decode optimizations: cabaf4eff3

### Does this PR introduce _any_ user-facing change?

Yes, init the PR.

### How was this patch tested?

- This is the first PR to make Ascend NPU work on vLLM. All code is tested on Ascend with vLLM V0 Engine.
- CI passed

---------

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: MengqingCao <cmq0113@163.com>
Co-authored-by: wangshuai09 <391746016@qq.com>
Co-authored-by: Shanshan Shen <467638484@qq.com>
Co-authored-by: wangli <wangli858794774@gmail.com>
2025-02-05 10:53:12 +08:00
parent eb283428dd
commit d5e7756028
47 changed files with 5161 additions and 0 deletions
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -0,0 +1,107 @@
+# Contributing to vLLM Ascend plugin
+
+## Building and testing
+It's recommended to set up a local development environment to build and test
+before you submit a PR.
+
+### Prepare environment and build
+
+The vllm-ascend build is only supported on Linux, because its dependency
+`torch_npu` only supports Linux.
+
+You can still set up a dev environment on Linux/Windows/macOS for linting and
+basic tests with the following commands:
+
+```bash
+# Choose a base dir (~/vllm-project/) and set up venv
+mkdir -p ~/vllm-project/ && cd ~/vllm-project/
+python3 -m venv .venv
+source ./.venv/bin/activate
+
+# Clone vllm code and install
+git clone https://github.com/vllm-project/vllm.git
+cd vllm
+pip install -r requirements-build.txt
+VLLM_TARGET_DEVICE="empty" pip install .
+cd ..
+
+# Clone vllm-ascend and install
+git clone https://github.com/vllm-project/vllm-ascend.git
+cd vllm-ascend
+pip install -r requirements-dev.txt
+
+# Then you can run lint and mypy test
+bash format.sh
+
+# Build:
+# - only supported on Linux (torch_npu available)
+# pip install -e .
+# - build without deps for debugging in other OS
+# pip install -e . --no-deps
+
+# Commit changed files using `-s`
+git commit -sm "your commit info"
+```
+
+### Testing
+
+Although the vllm-ascend CI provides integration tests on [Ascend](.github/workflows/vllm_ascend_test.yaml), you can also run them
+locally. The simplest way to run these integration tests locally is through a container:
+
+```bash
+# Under Ascend NPU environment
+git clone https://github.com/vllm-project/vllm-ascend.git
+cd vllm-ascend
+
+IMAGE=vllm-ascend-dev-image
+CONTAINER_NAME=vllm-ascend-dev
+DEVICE=/dev/davinci1
+
+# The first build will take about 10 mins (10MB/s) to download the base image and packages
+docker build -t $IMAGE -f ./Dockerfile .
+# You can also specify a mirror repo by setting VLLM_REPO to speed up the build
+# docker build -t $IMAGE -f ./Dockerfile . --build-arg VLLM_REPO=https://gitee.com/mirrors/vllm
+
+docker run --name $CONTAINER_NAME --network host --device $DEVICE \
+           --device /dev/davinci_manager --device /dev/devmm_svm \
+           --device /dev/hisi_hdc -v /usr/local/dcmi:/usr/local/dcmi \
+           -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
+           -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
+           -ti --rm $IMAGE bash
+
+cd vllm-ascend
+pip install -r requirements-dev.txt
+
+pytest tests/
+```
+
+## DCO and Signed-off-by
+
+When contributing changes to this project, you must agree to the DCO. Commits must include a `Signed-off-by:` header which certifies agreement with the terms of the DCO.
+
+Using `-s` with `git commit` will automatically add this header.
+
+## PR Title and Classification
+
+Only specific types of PRs will be reviewed. Prefix the PR title appropriately to indicate the type of change. Please use one of the following:
+
+- `[Attention]` for new features or optimizations in attention.
+- `[Communicator]` for new features or optimizations in communicators.
+- `[ModelRunner]` for new features or optimizations in model runner.
+- `[Platform]` for new features or optimizations in platform.
+- `[Worker]` for new features or optimizations in worker.
+- `[Core]` for new features or optimizations in the core vllm-ascend logic (such as platform, attention, communicators, model runner).
+- `[Kernel]` for changes affecting compute kernels and ops.
+- `[Bugfix]` for bug fixes.
+- `[Doc]` for documentation fixes and improvements.
+- `[Test]` for tests (such as unit tests).
+- `[CI]` for build or continuous integration improvements.
+- `[Misc]` for PRs that do not fit the above categories. Please use this sparingly.
+
+> [!NOTE]
+> If the PR spans more than one category, please include all relevant prefixes.
+
+## Others
+
+You may find more information about contributing to vLLM Ascend backend plugin on [<u>docs.vllm.ai</u>](https://docs.vllm.ai/en/latest/contributing/overview.html).
+If you run into any problem when contributing, feel free to submit a PR that improves the docs to help other developers.
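
Review note (not part of the patch): the title prefixes listed under "PR Title and Classification" lend themselves to a mechanical check. The following is a minimal sketch in POSIX shell, assuming a title is acceptable when it starts with one or more of the bracketed prefixes from that list. The function name `check_pr_title` is hypothetical; nothing in this commit defines or runs it, and the project may enforce the rule differently or not at all.

```shell
# Hypothetical helper (an assumption, not defined anywhere in this patch).
# Succeeds when the title starts with one or more bracketed prefixes taken
# from the list in CONTRIBUTING.md, e.g. "[Core]" or "[Core][CI]".
check_pr_title() {
    # Prefix names copied from the "PR Title and Classification" list.
    prefixes='Attention|Communicator|ModelRunner|Platform|Worker|Core|Kernel|Bugfix|Doc|Test|CI|Misc'
    # ERE: start of line, then "[<prefix>]" repeated at least once.
    # The match is case-sensitive, so "[core]" is rejected.
    printf '%s\n' "$1" | grep -Eq "^(\[(${prefixes})])+"
}

# Usage: the title of this very commit passes the check.
if check_pr_title "[Core] Init vllm-ascend (#3)"; then
    echo "title ok"
fi
```

Only the leading prefixes are checked, which accommodates the note about PRs spanning several categories; whether a space or a summary must follow the prefix is left open because the guide does not specify it.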