From e945e919331d8856cd01db7c467b61da4bf76c61 Mon Sep 17 00:00:00 2001
From: herizhen <59841270+herizhen@users.noreply.github.com>
Date: Tue, 25 Nov 2025 14:21:13 +0800
Subject: [PATCH] Document error correction (#4422)

### What this PR does / why we need it?
The "g" at the beginning of the current sentence is redundant and needs to be deleted.
"MindIE Turbo" no longer needs to be mentioned.

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
ut

- vLLM main: https://github.com/vllm-project/vllm/commit/2918c1b49c88c29783c86f78d2c4221cb9622379

---------

Signed-off-by: herizhen
Co-authored-by: herizhen
---
 .../developer_guide/feature_guide/KV_Cache_Pool_Guide.md        | 2 +-
 .../performance_and_debug/optimization_and_tuning.md            | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/source/developer_guide/feature_guide/KV_Cache_Pool_Guide.md b/docs/source/developer_guide/feature_guide/KV_Cache_Pool_Guide.md
index f29595f5..8841c4b7 100644
--- a/docs/source/developer_guide/feature_guide/KV_Cache_Pool_Guide.md
+++ b/docs/source/developer_guide/feature_guide/KV_Cache_Pool_Guide.md
@@ -80,4 +80,4 @@ The KV Connector methods that need to be implemented can be categorized into sch
 
 1. Currently, Mooncake Store for vLLM-Ascend only supports DRAM as the storage for KV Cache pool.
-2. For now, if we successfully looked up a key and found it exists, but failed to get it when calling KV Pool's get function, we just output a log indicating the get operation failed and keep going; hence, the accuracy of that specific request may be affected. gWe will handle this situation by falling back the request and re-compute everything assuming there's no prefix cache hit (or even better, revert only one block and keep using the Prefix Caches before that).
+2. For now, if we successfully looked up a key and found it exists, but failed to get it when calling KV Pool's get function, we just output a log indicating the get operation failed and keep going; hence, the accuracy of that specific request may be affected. We will handle this situation by falling back the request and re-compute everything assuming there's no prefix cache hit (or even better, revert only one block and keep using the Prefix Caches before that).
diff --git a/docs/source/developer_guide/performance_and_debug/optimization_and_tuning.md b/docs/source/developer_guide/performance_and_debug/optimization_and_tuning.md
index 953ec389..a0d06351 100644
--- a/docs/source/developer_guide/performance_and_debug/optimization_and_tuning.md
+++ b/docs/source/developer_guide/performance_and_debug/optimization_and_tuning.md
@@ -58,10 +58,10 @@ pip install modelscope pandas datasets gevent sacrebleu rouge_score pybind11 pyt
 VLLM_USE_MODELSCOPE=true
 ```
 
-Please follow the [Installation Guide](https://vllm-ascend.readthedocs.io/en/latest/installation.html) to make sure vLLM, vllm-ascend, and MindIE Turbo are installed correctly.
+Please follow the [Installation Guide](https://vllm-ascend.readthedocs.io/en/latest/installation.html) to make sure vLLM and vllm-ascend are installed correctly.
 
 :::{note}
-Make sure your vLLM and vllm-ascend are installed after your python configuration is completed, because these packages will build binary files using python in current environment. If you install vLLM, vllm-ascend, and MindIE Turbo before completing section 1.1, the binary files will not use the optimized python.
+Make sure your vLLM and vllm-ascend are installed after your python configuration is completed, because these packages will build binary files using python in current environment. If you install vLLM and vllm-ascend before completing section 1.1, the binary files will not use the optimized python.
 :::
 
 ## Optimizations