xc-llm-ascend/docs/source/user_guide/release_notes.md
wangxiyuan fafd70e91c [Doc] Update doc to work with release (#85)
1. Update CANN image name
2. Add pta install step
3. update vllm-ascend docker image name to ghcr
4. update quick_start to use vllm-ascend image directly.
5. fix `note` style

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2025-02-19 09:51:43 +08:00

Release note

v0.7.1.rc1

🎉 Hello, World!

We are excited to announce the first release candidate of v0.7.1 for vllm-ascend.

vLLM Ascend Plugin (vllm-ascend) is a community-maintained hardware plugin for running vLLM on the Ascend NPU. With this release, users can run the latest vLLM features and improvements on the Ascend NPU.

Please follow the official doc to get started. Note that this is a release candidate, so it may contain bugs or other issues. We appreciate your feedback and suggestions here.

Highlights

  • Initial support for Ascend NPU on vLLM. #3
  • DeepSeek is now supported. #88 #68
  • Qwen, Llama series and other popular models are also supported; see here for more details.

Core

  • Added the Ascend quantization config option; the implementation is coming soon. #7 #73
  • Added silu_and_mul and rope ops, and added mix ops to the attention layer. #18
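
For readers unfamiliar with the silu_and_mul op named above, here is a minimal pure-Python sketch of the math it conventionally computes in vLLM: the input's last dimension is split into two halves, SiLU is applied to the first half, and the result is multiplied elementwise by the second half. This is only a reference illustration on a single row of floats; it is not the NPU kernel added in #18, and the helper names and list-based signature are assumptions made for this example.

```python
import math


def silu(x: float) -> float:
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def silu_and_mul(row: list[float]) -> list[float]:
    """Reference semantics for one row: silu(first half) * second half.

    Hypothetical helper for illustration; the real op runs on tensors.
    """
    if len(row) % 2 != 0:
        raise ValueError("last dimension must have an even length")
    d = len(row) // 2
    gate, up = row[:d], row[d:]
    # Elementwise product of the activated gate half and the up half.
    return [silu(g) * u for g, u in zip(gate, up)]


if __name__ == "__main__":
    # gate = [0.0, 1.0], up = [2.0, 3.0]
    print(silu_and_mul([0.0, 1.0, 2.0, 3.0]))
```

A fused op of this kind halves the output width relative to the input, which is why it is typically placed right after a merged gate/up projection in the MLP block.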

Other

  • [CI] Enable Ascend CI to actively monitor and improve quality for vLLM on Ascend. #3
  • [Docker] Add vllm-ascend container image #64
  • [Docs] Add a live doc #55

Known issues

  • This release relies on an unreleased torch_npu version. It is already installed in the official container image. Please install it manually if you are using a non-container environment.
  • Logs such as `No platform detected, vLLM is running on UnspecifiedPlatform` or `Failed to import from vllm._C with ModuleNotFoundError("No module named 'vllm._C'")` appear when running vllm-ascend. They do not affect functionality or performance and can be safely ignored. This has been fixed in this PR, which will be included in v0.7.3 soon.
  • Logs such as `# CPU blocks: 35064, # CPU blocks: 2730` appear when running vllm-ascend; the label should read `# NPU blocks:`. This does not affect functionality or performance and can be safely ignored. It has been fixed in this PR, which will be included in v0.7.3 soon.