Add release notes for v0.16.0rc1
- vLLM version: v0.16.0
- vLLM main: 4034c3d32e
---------
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Signed-off-by: Canlin Guo <961750412@qq.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>
<p align="center">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://raw.githubusercontent.com/vllm-project/vllm-ascend/main/docs/source/logos/vllm-ascend-logo-text-dark.png">
<img alt="vllm-ascend" src="https://raw.githubusercontent.com/vllm-project/vllm-ascend/main/docs/source/logos/vllm-ascend-logo-text-light.png" width=55%>
</picture>
</p>

<h3 align="center">
vLLM Ascend Plugin
</h3>

<div align="center">

[DeepWiki](https://deepwiki.com/vllm-project/vllm-ascend)

</div>

<p align="center">
| <a href="https://www.hiascend.com/en/"><b>About Ascend</b></a> | <a href="https://docs.vllm.ai/projects/ascend/en/latest/"><b>Documentation</b></a> | <a href="https://slack.vllm.ai"><b>#SIG-Ascend</b></a> | <a href="https://discuss.vllm.ai/c/hardware-support/vllm-ascend-support"><b>Users Forum</b></a> | <a href="https://tinyurl.com/vllm-ascend-meeting"><b>Weekly Meeting</b></a> |
</p>

<p align="center">
<a ><b>English</b></a> | <a href="README.zh.md"><b>中文</b></a>
</p>

---
*Latest News* 🔥

- [2026/02] We released the new official version [v0.13.0](https://github.com/vllm-project/vllm-ascend/releases/tag/v0.13.0)! Please follow the [official guide](https://docs.vllm.ai/projects/ascend/en/v0.13.0/) to start using the vLLM Ascend Plugin on Ascend.
- [2025/12] We released the new official version [v0.11.0](https://github.com/vllm-project/vllm-ascend/releases/tag/v0.11.0)! Please follow the [official guide](https://docs.vllm.ai/projects/ascend/en/v0.11.0/) to start using the vLLM Ascend Plugin on Ascend.
- [2025/09] We released the new official version [v0.9.1](https://github.com/vllm-project/vllm-ascend/releases/tag/v0.9.1)! Please follow the [official guide](https://docs.vllm.ai/projects/ascend/en/v0.9.1/tutorials/large_scale_ep.html) to start deploying large-scale Expert Parallelism (EP) on Ascend.
- [2025/08] We hosted the [vLLM Beijing Meetup](https://mp.weixin.qq.com/s/7n8OYNrCC_I9SJaybHA_-Q) with vLLM and Tencent! Please find the meetup slides [here](https://drive.google.com/drive/folders/1Pid6NSFLU43DZRi0EaTcPgXsAzDvbBqF).
- [2025/06] The [User stories](https://docs.vllm.ai/projects/ascend/en/latest/community/user_stories/index.html) page is now live! It kicks off with LLaMA-Factory/verl/TRL/GPUStack to demonstrate how vLLM Ascend helps Ascend users with fine-tuning, evaluation, reinforcement learning (RL), and deployment.
- [2025/06] The [Contributors](https://docs.vllm.ai/projects/ascend/en/latest/community/contributors.html) page is now live! All contributions deserve to be recorded; thanks to all contributors.
- [2025/05] We've released the first official version [v0.7.3](https://github.com/vllm-project/vllm-ascend/releases/tag/v0.7.3)! We collaborated with the vLLM community to publish a blog post sharing our practice: [Introducing vLLM Hardware Plugin, Best Practice from Ascend NPU](https://blog.vllm.ai/2025/05/12/hardware-plugin.html).
- [2025/03] We hosted the [vLLM Beijing Meetup](https://mp.weixin.qq.com/s/VtxO9WXa5fC-mKqlxNUJUQ) with the vLLM team! Please find the meetup slides [here](https://drive.google.com/drive/folders/1Pid6NSFLU43DZRi0EaTcPgXsAzDvbBqF).
- [2025/02] The vLLM community officially created the [vllm-project/vllm-ascend](https://github.com/vllm-project/vllm-ascend) repo for running vLLM seamlessly on the Ascend NPU.
- [2024/12] We are working with the vLLM community to support [[RFC]: Hardware pluggable](https://github.com/vllm-project/vllm/issues/11162).

---

## Overview

vLLM Ascend (`vllm-ascend`) is a community-maintained hardware plugin for running vLLM seamlessly on the Ascend NPU.

It is the recommended approach for supporting the Ascend backend within the vLLM community. It adheres to the principles outlined in the [[RFC]: Hardware pluggable](https://github.com/vllm-project/vllm/issues/11162), providing a hardware-pluggable interface that decouples the integration of the Ascend NPU with vLLM.

With the vLLM Ascend plugin, popular open-source models, including Transformer-like, Mixture-of-Experts (MoE), embedding, and multi-modal LLMs, can run seamlessly on the Ascend NPU.

## Prerequisites

- Hardware: Atlas 800I A2 Inference series, Atlas A2 Training series, Atlas 800I A3 Inference series, Atlas A3 Training series, Atlas 300I Duo (Experimental)
- OS: Linux
- Software:
  - Python >= 3.10, < 3.12
  - CANN == 8.5.0 (for the matching Ascend HDK version, see [here](https://www.hiascend.com/document/detail/zh/canncommercial/83RC2/releasenote/releasenote_0000.html))
  - PyTorch == 2.9.0, torch-npu == 2.9.0
  - vLLM (the same version as vllm-ascend)
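
The software pins above can be sanity-checked before installation. The following is a minimal sketch, not part of vllm-ascend: the helper names (`python_supported`, `package_matches`, `check_packages`) are hypothetical, the distribution names `torch` and `torch-npu` are assumed to be what `pip` reports, and the exact-match comparison is a simplification. CANN is not a Python package, so it is not covered here.

```python
import sys
from importlib import metadata

# Pins copied from the Prerequisites list.
PYTHON_MIN = (3, 10)  # inclusive lower bound
PYTHON_MAX = (3, 12)  # exclusive upper bound
# Assumption: these are the distribution names as pip reports them.
PINNED_PACKAGES = {"torch": "2.9.0", "torch-npu": "2.9.0"}


def python_supported(version=None):
    """Return True if the (major, minor) version is >= 3.10 and < 3.12."""
    major_minor = tuple((version or sys.version_info)[:2])
    return PYTHON_MIN <= major_minor < PYTHON_MAX


def package_matches(installed, pinned):
    """Compare versions exactly, ignoring a local suffix such as '+cpu'."""
    return installed.split("+", 1)[0] == pinned


def check_packages(pins=PINNED_PACKAGES):
    """Map each pinned package to 'ok', 'missing', or the mismatched version."""
    report = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if package_matches(installed, pinned) else installed
    return report


if __name__ == "__main__":
    print("python ok:", python_supported())
    print(check_packages())
```

Running the script prints whether the current interpreter falls inside the supported range and reports each pinned package as `ok`, `missing`, or the version actually installed.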

## Getting Started

Please use the following recommended versions to get started quickly:

| Version | Release type | Doc |
|------------|--------------|--------------------------------------|
| v0.16.0rc1 | Latest release candidate | See [QuickStart](https://docs.vllm.ai/projects/ascend/en/latest/quick_start.html) and [Installation](https://docs.vllm.ai/projects/ascend/en/latest/installation.html) for more details |
| v0.13.0 | Latest stable version | See [QuickStart](https://docs.vllm.ai/projects/ascend/en/v0.13.0/quick_start.html) and [Installation](https://docs.vllm.ai/projects/ascend/en/v0.13.0/installation.html) for more details |

## Contributing

See [CONTRIBUTING](https://docs.vllm.ai/projects/ascend/en/latest/developer_guide/contribution/index.html), a step-by-step guide that helps you set up the development environment, build, and test.

We welcome and value any contributions and collaborations:

- Please let us know if you encounter a bug by [filing an issue](https://github.com/vllm-project/vllm-ascend/issues).
- Please use the [Users Forum](https://discuss.vllm.ai/c/hardware-support/vllm-ascend-support) for usage questions and help.

## Branch

vllm-ascend has a main branch and a dev branch.

- **main**: the main branch, which corresponds to the vLLM main branch and is continuously monitored for quality through Ascend CI.
- **releases/vX.Y.Z**: a development branch, created alongside new releases of vLLM. For example, `releases/v0.13.0` is the dev branch for the vLLM `v0.13.0` release.

Below are the maintained branches:

| Branch | Status | Note |
|------------|--------------|--------------------------------------|
| main | Maintained | CI commitment for vLLM main branch and vLLM v0.16.0 tag |
| v0.7.1-dev | Unmaintained | Only doc fixes are allowed |
| v0.7.3-dev | Maintained | CI commitment for vLLM 0.7.3 version; only bug fixes are allowed, and no new release tags will be created |
| v0.9.1-dev | Maintained | CI commitment for vLLM 0.9.1 version |
| v0.11.0-dev | Maintained | CI commitment for vLLM 0.11.0 version |
| releases/v0.13.0 | Maintained | CI commitment for vLLM 0.13.0 version |
| rfc/feature-name | Maintained | [Feature branches](https://docs.vllm.ai/projects/ascend/en/latest/community/versioning_policy.html#feature-branches) for collaboration |

Please refer to [Versioning policy](https://docs.vllm.ai/projects/ascend/en/latest/community/versioning_policy.html) for more details.
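
The branch names in this section follow a few recognizable shapes (`main`, `releases/vX.Y.Z`, `vX.Y.Z-dev`, `rfc/feature-name`). As an illustration only, the sketch below maps a branch name to the vLLM version it tracks. The function `classify_branch` and the labels it returns (`"dev"`, `"feature"`, `"unknown"`) are hypothetical and are not defined by the project; the patterns are inferred from the table above, and the versioning policy remains the authoritative source.

```python
import re

# Patterns inferred from the branch names listed in this section.
_RELEASE = re.compile(r"^releases/v(\d+\.\d+\.\d+)$")
_LEGACY_DEV = re.compile(r"^v(\d+\.\d+\.\d+)-dev$")


def classify_branch(name):
    """Return (kind, vllm_version) for a vllm-ascend branch name.

    vllm_version is None when the name does not encode a version.
    """
    if name == "main":
        return ("main", None)
    if name.startswith("rfc/") and len(name) > len("rfc/"):
        return ("feature", None)
    for pattern in (_RELEASE, _LEGACY_DEV):
        match = pattern.match(name)
        if match:
            return ("dev", match.group(1))
    return ("unknown", None)


if __name__ == "__main__":
    for branch in ("main", "releases/v0.13.0", "v0.11.0-dev", "rfc/feature-name"):
        print(branch, "->", classify_branch(branch))
```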

## Weekly Meeting

- vLLM Ascend Weekly Meeting: <https://tinyurl.com/vllm-ascend-meeting>
- Wednesday, 15:00 - 16:00 (UTC+8, [Convert to your timezone](https://dateful.com/convert/gmt8?t=15))

## License

Apache License 2.0, as found in the [LICENSE](./LICENSE) file.