From c77dca54b2dec24a6f37368743999b662da616ef Mon Sep 17 00:00:00 2001
From: wangxiyuan
Date: Wed, 10 Dec 2025 16:57:24 +0800
Subject: [PATCH] [CI] fix lint (#4888)

Fix lint CI error

Signed-off-by: wangxiyuan
---
 docs/source/tutorials/Qwen3-Dense.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/source/tutorials/Qwen3-Dense.md b/docs/source/tutorials/Qwen3-Dense.md
index 6717bf11..395f9dce 100644
--- a/docs/source/tutorials/Qwen3-Dense.md
+++ b/docs/source/tutorials/Qwen3-Dense.md
@@ -45,6 +45,7 @@ You can using our official docker image for supporting Qwen3 Dense models.
 Currently, we provide the all-in-one images.[Download images](https://quay.io/repository/ascend/vllm-ascend?tab=tags)
 
 #### Docker Pull (by tag)
+
 ```{code-block} bash
 :substitutions:
 
@@ -53,6 +54,7 @@ docker pull quay.io/ascend/vllm-ascend:|vllm_ascend_version|
 ```
 
 #### Docker run
+
 ```{code-block} bash
 :substitutions:
 
@@ -344,7 +346,7 @@ The configuration compilation_config = { "cudagraph_mode": "FULL_DECODE_ONLY"} i
 ### 8. Asynchronous Scheduling
 
 Asynchronous scheduling is a technique used to optimize inference efficiency. It allows non-blocking task scheduling to improve concurrency and throughput, especially when processing large-scale models.
-This optimization is enabled by setting `--async-scheduling`.
+This optimization is enabled by setting `--async-scheduling`.
 
 ## Optimization Highlights