[Doc] fix the nit in docs (#6826)

Refresh the doc, fix the nit in the docs

- vLLM version: v0.15.0
- vLLM main: 83b47f67b1

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
2026-02-27 11:50:27 +08:00
parent 981d803cb7
commit a95c0b8b82
30 changed files with 145 additions and 118 deletions
--- a/docs/source/developer_guide/feature_guide/disaggregated_prefill.md
+++ b/docs/source/developer_guide/feature_guide/disaggregated_prefill.md
@@ -7,8 +7,8 @@ This feature addresses the need to optimize the **Time Per Output Token (TPOT)**
 1. **Adjusting Parallel Strategy and Instance Count for P and D Nodes**  
   Using the disaggregated-prefill strategy, this feature allows the system to flexibly adjust the parallelization strategy (e.g., data parallelism (dp), tensor parallelism (tp), and expert parallelism (ep)) and the instance count for both P (Prefiller) and D (Decoder) nodes. This leads to better system performance tuning, particularly for **TTFT** and **TPOT**.

-2. **Optimizing TPOT**  
-   Without disaggregated-prefill strategy, prefill tasks are inserted during decoding, which results in inefficiencies and delays. disaggregated-prefill solves this by allowing for better control over the system’s **TPOT**. By managing chunked prefill tasks effectively, the system avoids the challenge of determining the optimal chunk size and provides more reliable control over the time taken for generating output tokens.
+2. **Optimizing TPOT**
+   Without the disaggregated-prefill strategy, prefill tasks are inserted during decoding, which results in inefficiencies and delays. Disaggregated-prefill solves this by allowing for better control over the system’s **TPOT**. By managing chunked prefill tasks effectively, the system avoids the challenge of determining the optimal chunk size and provides more reliable control over the time taken for generating output tokens.

 ---