[Doc] Add release note (#59)
Add release note template and init the first release note content Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com> Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
13
docs/source/user_guide/release.template.md
Normal file
@@ -0,0 +1,13 @@
## {version}
### Highlights
- {feature}
### Bug fixes
- {bug}
### Other changes
- {change}
### Known issues
- {issue}
### Upgrade Notes
- {upgrade}
### Deprecation Notes
- {deprecation}
20
docs/source/user_guide/release_notes.md
Normal file
@@ -0,0 +1,20 @@
# Release note

## v0.7.1.rc1

We are excited to announce the release candidate of v0.7.1 for vllm-ascend. vllm-ascend is a community-maintained hardware plugin for running vLLM on the Ascend NPU. With this release, users can now enjoy the latest features and improvements of vLLM on the Ascend NPU.

Note that this is a release candidate, and there may be some bugs or issues. We appreciate your feedback and suggestions [here](https://github.com/vllm-project/vllm-ascend/issues/19).

### Highlights

- This is the first release that officially supports running vLLM on the Ascend NPU. Please follow the [official doc](https://vllm-ascend.readthedocs.io/en/latest/) to start the journey.

### Other changes

- Added the Ascend quantization config option; the implementation is coming soon.

### Known issues

- This release relies on an unreleased torch_npu version. Please [install](https://vllm-ascend.readthedocs.io/en/latest/installation.html) it manually.
- Logs like `No platform deteced, vLLM is running on UnspecifiedPlatform` or `Failed to import from vllm._C with ModuleNotFoundError("No module named 'vllm._C'")` may appear when running vllm-ascend. They do not affect functionality or performance and can safely be ignored. This has been fixed in this [PR](https://github.com/vllm-project/vllm/pull/12432), which will be included in v0.7.3 soon.
24
docs/source/user_guide/supported_models.md
Normal file
@@ -0,0 +1,24 @@
# Supported Models

| Model | Supported | Note |
|---------|-----------|------|
| Qwen 2.5 | ✅ ||
| Mistral | | Need test |
| DeepSeek v2.5 | | Need test |
| Llama 3.1/3.2 | ✅ ||
| Gemma-2 | | Need test |
| Baichuan | | Need test |
| MiniCPM | | Need test |
| InternLM | ✅ ||
| ChatGLM | ✅ ||
| InternVL 2.5 | ✅ ||
| Qwen2-VL | ✅ ||
| GLM-4v | | Need test |
| Molmo | ✅ ||
| LLaVA 1.5 | ✅ ||
| Mllama | | Need test |
| LLaVA-Next | | Need test |
| LLaVA-Next-Video | | Need test |
| Phi-3-Vision/Phi-3.5-Vision | | Need test |
| Ultravox | | Need test |
| Qwen2-Audio | ✅ ||
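A minimal offline-inference sketch for any model marked ✅ in the table above, using vLLM's standard `LLM`/`SamplingParams` API. The model name and sampling settings here are illustrative, and vllm-ascend must already be installed per the official doc; this requires Ascend NPU hardware to actually run.

```python
from vllm import LLM, SamplingParams

# Illustrative model choice; substitute any model marked ✅ in the table above.
llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")
sampling_params = SamplingParams(temperature=0.8, max_tokens=64)

# Generate completions for a batch of prompts on the Ascend NPU.
outputs = llm.generate(["Hello, my name is"], sampling_params)
for output in outputs:
    print(output.outputs[0].text)
```

The code path is identical to running vLLM on other hardware; the plugin selects the Ascend backend automatically.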
21
docs/source/user_guide/suppoted_features.md
Normal file
@@ -0,0 +1,21 @@
# Feature Support

| Feature | Supported | Note |
|---------|-----------|------|
| Chunked Prefill | ✗ | Plan in 2025 Q1 |
| Automatic Prefix Caching | ✅ | Improve performance in 2025 Q2 |
| LoRA | ✗ | Plan in 2025 Q1 |
| Prompt adapter | ✗ | Plan in 2025 Q1 |
| Speculative decoding | ✗ | Plan in 2025 Q1 |
| Pooling | ✗ | Plan in 2025 Q2 |
| Enc-dec | ✗ | Plan in 2025 Q2 |
| Multi Modality | ✅ (LLaVA/Qwen2-VL/Qwen2-Audio/InternVL) | Add more model support in 2025 Q1 |
| LogProbs | ✅ ||
| Prompt logProbs | ✅ ||
| Async output | ✅ ||
| Multi step scheduler | ✗ | Plan in 2025 Q1 |
| Best of | ✅ ||
| Beam search | ✅ ||
| Guided Decoding | ✗ | Plan in 2025 Q1 |
| Tensor Parallel | ✅ | Only "mp" supported now |
| Pipeline Parallel | ✅ | Only "mp" supported now |
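Since only the "mp" (multiprocessing) executor backend is supported, tensor parallelism would be enabled along these lines. The model name and NPU count are illustrative; this is a sketch of the standard vLLM serving CLI, not an Ascend-specific command.

```shell
# Illustrative: serve a model across 2 Ascend NPUs with the multiprocessing backend.
vllm serve Qwen/Qwen2.5-7B-Instruct \
    --tensor-parallel-size 2 \
    --distributed-executor-backend mp
```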