[Doc] Add release note (#59)
Add release note template and init the first release note content Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com> Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
13
docs/source/user_guide/release.template.md
Normal file
@@ -0,0 +1,13 @@
## {version}
### Highlights
- {feature}
### Bug fixes
- {bug}
### Other changes
- {change}
### Known issues
- {issue}
### Upgrade Notes
- {upgrade}
### Deprecation Notes
- {deprecation}
20
docs/source/user_guide/release_notes.md
Normal file
@@ -0,0 +1,20 @@
# Release note

## v0.7.1.rc1

We are excited to announce the release candidate of v0.7.1 for vllm-ascend. vllm-ascend is a community-maintained hardware plugin for running vLLM on the Ascend NPU. With this release, users can now enjoy the latest features and improvements of vLLM on the Ascend NPU.

Note that this is a release candidate, and there may be some bugs or issues. We appreciate your feedback and suggestions [here](https://github.com/vllm-project/vllm-ascend/issues/19).

### Highlights

- This is the first release that officially supports running vLLM on the Ascend NPU. Please follow the [official doc](https://vllm-ascend.readthedocs.io/en/latest/) to start the journey.

### Other changes

- Added the Ascend quantization config option; the implementation is coming soon.

### Known issues

- This release relies on an unreleased torch_npu version. Please [install](https://vllm-ascend.readthedocs.io/en/latest/installation.html) it manually.
- Logs like `No platform deteced, vLLM is running on UnspecifiedPlatform` or `Failed to import from vllm._C with ModuleNotFoundError("No module named 'vllm._C'")` may appear when running vllm-ascend. They do not affect functionality or performance and can safely be ignored. This has been fixed in this [PR](https://github.com/vllm-project/vllm/pull/12432), which will be included in v0.7.3 soon.
24
docs/source/user_guide/supported_models.md
Normal file
@@ -0,0 +1,24 @@
# Supported Models

| Model | Supported | Note |
|---------|-----------|------|
| Qwen 2.5 | ✅ ||
| Mistral | | Need test |
| DeepSeek v2.5 | | Need test |
| Llama 3.1/3.2 | ✅ ||
| Gemma-2 | | Need test |
| Baichuan | | Need test |
| MiniCPM | | Need test |
| InternLM | ✅ ||
| ChatGLM | ✅ ||
| InternVL 2.5 | ✅ ||
| Qwen2-VL | ✅ ||
| GLM-4v | | Need test |
| Molmo | ✅ ||
| LLaVA 1.5 | ✅ ||
| Mllama | | Need test |
| LLaVA-Next | | Need test |
| LLaVA-Next-Video | | Need test |
| Phi-3-Vision/Phi-3.5-Vision | | Need test |
| Ultravox | | Need test |
| Qwen2-Audio | ✅ ||
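A minimal offline-inference sketch for any model marked ✅ in the table above, using vLLM's standard `LLM`/`SamplingParams` API. The model name and sampling settings here are illustrative, and vllm-ascend must already be installed per the official doc; this requires Ascend NPU hardware to actually run.

```python
from vllm import LLM, SamplingParams

# Illustrative model choice; substitute any model marked ✅ in the table above.
llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")
sampling_params = SamplingParams(temperature=0.8, max_tokens=64)

# Generate completions for a batch of prompts on the Ascend NPU.
outputs = llm.generate(["Hello, my name is"], sampling_params)
for output in outputs:
    print(output.outputs[0].text)
```

The code path is identical to running vLLM on other hardware; the plugin selects the Ascend backend automatically.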
21
docs/source/user_guide/suppoted_features.md
Normal file
@@ -0,0 +1,21 @@
# Feature Support

| Feature | Supported | Note |
|---------|-----------|------|
| Chunked Prefill | ✗ | Plan in 2025 Q1 |
| Automatic Prefix Caching | ✅ | Improve performance in 2025 Q2 |
| LoRA | ✗ | Plan in 2025 Q1 |
| Prompt adapter | ✗ | Plan in 2025 Q1 |
| Speculative decoding | ✗ | Plan in 2025 Q1 |
| Pooling | ✗ | Plan in 2025 Q2 |
| Enc-dec | ✗ | Plan in 2025 Q2 |
| Multi Modality | ✅ (LLaVA/Qwen2-VL/Qwen2-Audio/InternVL) | Add more model support in 2025 Q1 |
| LogProbs | ✅ ||
| Prompt logProbs | ✅ ||
| Async output | ✅ ||
| Multi step scheduler | ✗ | Plan in 2025 Q1 |
| Best of | ✅ ||
| Beam search | ✅ ||
| Guided Decoding | ✗ | Plan in 2025 Q1 |
| Tensor Parallel | ✅ | Only "mp" supported now |
| Pipeline Parallel | ✅ | Only "mp" supported now |
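Since only the "mp" (multiprocessing) executor backend is supported, tensor parallelism would be enabled along these lines. The model name and NPU count are illustrative; this is a sketch of the standard vLLM serving CLI, not an Ascend-specific command.

```shell
# Illustrative: serve a model across 2 Ascend NPUs with the multiprocessing backend.
vllm serve Qwen/Qwen2.5-7B-Instruct \
    --tensor-parallel-size 2 \
    --distributed-executor-backend mp
```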