[Lint]Style: reformat markdown files via markdownlint (#5884)

### What this PR does / why we need it?

reformat markdown files via markdownlint

- vLLM version: v0.13.0
- vLLM main: bde38c11df

---------

Signed-off-by: root <root@LAPTOP-VQKDDVMG.localdomain>
Signed-off-by: MrZ20 <2609716663@qq.com>
Co-authored-by: root <root@LAPTOP-VQKDDVMG.localdomain>
2026-01-15 09:06:01 +08:00
parent 96edd4673f
commit 4811ba62e0
75 changed files with 711 additions and 308 deletions
--- a/docs/source/developer_guide/feature_guide/disaggregated_prefill.md
+++ b/docs/source/developer_guide/feature_guide/disaggregated_prefill.md
@@ -15,6 +15,7 @@ This feature addresses the need to optimize the **Time Per Output Token (TPOT)**
 ## Usage

 vLLM Ascend currently supports two types of connectors for handling KV cache management:  
+
 - **MooncakeConnector**: D nodes pull KV cache from P nodes.
 - **MooncakeLayerwiseConnector**: P nodes push KV cache to D nodes in a layered manner.  

@@ -35,7 +36,7 @@ Our design diagram is shown below, illustrating the pull and push schemes respec
 ![alt text](../../assets/disaggregated_prefill_pull.png)
 ![alt text](../../assets/disaggregated_prefill_push.png)

-#### Mooncake Connector:
+#### Mooncake Connector

 1. The request is sent to the Proxy’s `_handle_completions` endpoint.
 2. The Proxy calls `select_prefiller` to choose a P node and forwards the request, configuring `kv_transfer_params` with `do_remote_decode=True`, `max_tokens=1`, and `min_tokens=1`.
@@ -43,7 +44,7 @@ Our design diagram is shown below, illustrating the pull and push schemes respec
 4. The Proxy calls `select_decoder` to choose a D node and forwards the request.
 5. On the D node, the scheduler marks the request as `RequestStatus.WAITING_FOR_REMOTE_KVS`, pre-allocates KV cache, calls `kv_connector_no_forward` to pull the remote KV cache, then notifies the P node to release KV cache and proceeds with decoding to return the result.

-#### Mooncake Layerwise Connector:
+#### Mooncake Layerwise Connector

 1. The request is sent to the Proxy’s `_handle_completions` endpoint.
 2. The Proxy calls `select_decoder` to choose a D node and forwards the request, configuring `kv_transfer_params` with `do_remote_prefill=True` and setting the `metaserver` endpoint.
@@ -55,6 +56,7 @@ Our design diagram is shown below, illustrating the pull and push schemes respec
 ### 3. Interface Design

 Taking MooncakeConnector as an example, the system is organized into three primary classes:
+
 - **MooncakeConnector**: Base class that provides core interfaces.
 - **MooncakeConnectorScheduler**: Interface for scheduling the connectors within the engine core, responsible for managing KV cache transfer requirements and completion.
 - **MooncakeConnectorWorker**: Interface for managing KV cache registration and transfer in worker processes.
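
Reviewer note on the Mooncake Connector hunk: step 2 of the pull scheme (the Proxy forwards the request to a P node with `do_remote_decode=True`, `max_tokens=1`, and `min_tokens=1`) can be sketched roughly as below. This is only an illustration under stated assumptions, not the Proxy's actual code: `build_prefill_request` is a hypothetical helper, the request is modeled as a plain dict, and placing `max_tokens`/`min_tokens` at the top level of the request (rather than inside `kv_transfer_params`) is an assumption, since the documented sentence can be read either way.

```python
import copy
from typing import Any


def build_prefill_request(request: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: derive the request a Proxy could forward to a P node."""
    # Copy first so the client's original request can still be sent to a D node later.
    prefill_request = copy.deepcopy(request)
    # Flag named in the doc: the decode phase will run remotely, on a D node.
    prefill_request["kv_transfer_params"] = {"do_remote_decode": True}
    # Assumption: the token limits are top-level request fields.
    # Capping generation at one token makes the P node run prefill only.
    prefill_request["max_tokens"] = 1
    prefill_request["min_tokens"] = 1
    return prefill_request


# Example with made-up request values.
original = {"model": "example-model", "prompt": "Hello", "max_tokens": 128}
prefill = build_prefill_request(original)
```

The deep copy matters here because, per step 4, the Proxy later forwards the request to a D node, which needs the client's original token limits rather than the single-token cap.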