[Doc] Add 0.8.5rc1 release note (#756)

### What this PR does / why we need it?
Add 0.8.5rc1 release note and bump vllm version to v0.8.5.post1

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed

---------

Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
2025-05-06 23:46:35 +08:00
parent 2cd036ee8e
commit ec27af346a
7 changed files with 45 additions and 26 deletions
--- a/docs/source/user_guide/suppoted_features.md
+++ b/docs/source/user_guide/suppoted_features.md
@@ -6,18 +6,18 @@ You can check the [support status of vLLM V1 Engine][v1_user_guide]. Below is th

 | Feature                       | vLLM V0 Engine | vLLM V1 Engine | Next Step                                                              |
 |-------------------------------|----------------|----------------|------------------------------------------------------------------------|
-| Chunked Prefill               | 🚧 WIP         | 🚧 WIP         | Functional, waiting for CANN 8.1 nnal package release                  |
-| Automatic Prefix Caching      | 🚧 WIP         | 🚧 WIP         | Functional, waiting for CANN 8.1 nnal package release                  |
+| Chunked Prefill               | 🚧 WIP         | 🟢 Functional  | Functional, see detail note: [Chunked Prefill][cp]                     |
+| Automatic Prefix Caching      | 🚧 WIP         | 🟢 Functional  | Functional, see detail note: [vllm-ascend#732][apc]                    |
 | LoRA                          | 🟢 Functional  | 🚧 WIP         | [vllm-ascend#396][multilora], CI needed, working on V1 support         |
-| Prompt adapter                | No plan        | 🟡 Planned     | Plan in 2025.06.30                                                     |
+| Prompt adapter                | 🔴 No plan     | 🟡 Planned     | Plan in 2025.06.30                                                     |
 | Speculative decoding          | 🟢 Functional  | 🚧 WIP         | CI needed; working on V1 support                                       |
-| Pooling                       | 🟢 Functional  | 🟢 Functional  | CI needed and adapting more models; V1 support rely on vLLM support.   |
+| Pooling                       | 🟢 Functional  | 🟡 Planned     | CI needed and adapting more models; V1 support rely on vLLM support.   |
 | Enc-dec                       | 🔴 NO plan     | 🟡 Planned     | Plan in 2025.06.30                                                     |
 | Multi Modality                | 🟢 Functional  | 🟢 Functional  | [Tutorial][multimodal], optimizing and adapting more models            |
 | LogProbs                      | 🟢 Functional  | 🟢 Functional  | CI needed                                                              |
 | Prompt logProbs               | 🟢 Functional  | 🟢 Functional  | CI needed                                                              |
 | Async output                  | 🟢 Functional  | 🟢 Functional  | CI needed                                                              |
-| Multi step scheduler          | 🟢 Functional  | 🔴 Deprecated  | [vllm#8779][v1_rfc], replaced by [vLLM V1 Scheduler][v1_scheduler])    | 
+| Multi step scheduler          | 🟢 Functional  | 🔴 Deprecated  | [vllm#8779][v1_rfc], replaced by [vLLM V1 Scheduler][v1_scheduler]     |
 | Best of                       | 🟢 Functional  | 🔴 Deprecated  | [vllm#13361][best_of], CI needed                                       |
 | Beam search                   | 🟢 Functional  | 🟢 Functional  | CI needed                                                              |
 | Guided Decoding               | 🟢 Functional  | 🟢 Functional  | [vllm-ascend#177][guided_decoding]                                     |
@@ -27,11 +27,12 @@ You can check the [support status of vLLM V1 Engine][v1_user_guide]. Below is th
 | Data Parallel                 | 🔴 NO plan     | 🟢 Functional  | CI needed;  No plan on V0 support                                      |
 | Prefill Decode Disaggregation | 🟢 Functional  | 🟢 Functional  | 1P1D available, working on xPyD and V1 support.                        |
 | Quantization                  | 🟢 Functional  | 🟢 Functional  | W8A8 available, CI needed; working on more quantization method support |
-| Graph Mode                    | 🔴 NO plan     | 🟢 Functional  | Functional, waiting for CANN 8.1 nnal package release                  |
+| Graph Mode                    | 🔴 NO plan     | 🔵 Experimental| Experimental, see detail note: [vllm-ascend#767][graph_mode]           |
 | Sleep Mode                    | 🟢 Functional  | 🟢 Functional  | level=1 available, CI needed, working on V1 support                    |

 - 🟢 Functional: Fully operational, with ongoing optimizations.
-- 🚧 WIP: Under active development
+- 🔵 Experimental: Experimental support, interfaces and functions may change.
+- 🚧 WIP: Under active development, will be supported soon.
 - 🟡 Planned: Scheduled for future implementation (some may have open PRs/RFCs).
 - 🔴 NO plan / Deprecated: No plan for V0 or deprecated by vLLM v1.

@@ -42,3 +43,6 @@ You can check the [support status of vLLM V1 Engine][v1_user_guide]. Below is th
 [v1_scheduler]: https://github.com/vllm-project/vllm/blob/main/vllm/v1/core/sched/scheduler.py
 [v1_rfc]: https://github.com/vllm-project/vllm/issues/8779
 [multilora]: https://github.com/vllm-project/vllm-ascend/issues/396
+[graph_mode]: https://github.com/vllm-project/vllm-ascend/issues/767
+[apc]: https://github.com/vllm-project/vllm-ascend/issues/732
+[cp]: https://docs.vllm.ai/en/stable/performance/optimization.html#chunked-prefill