[Doc][Misc] Correcting the document and uploading the model deployment template (#8287)

### What this PR does / why we need it?

Correcting the document and uploading the model deployment template

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

---------

Signed-off-by: herizhen <1270637059@qq.com>
Signed-off-by: herizhen <59841270+herizhen@users.noreply.github.com>
2026-04-15 16:03:11 +08:00
parent 147b589f62
commit 95726d20eb
31 changed files with 536 additions and 308 deletions
--- a/docs/source/developer_guide/performance_and_debug/msprobe_guide.md
+++ b/docs/source/developer_guide/performance_and_debug/msprobe_guide.md
@@ -332,7 +332,6 @@ An L0 `dump.json` contains forward I/O for modules together with parameters. Usi
     "data_name": "Module.conv2.Conv2d.forward.0.parameters.bias.pt"
    }
   }
-  },
  }
 }
 }
@@ -389,7 +388,6 @@ An L1 `dump.json` records forward I/O for APIs. Using PyTorch's `relu` function
     "data_name": "Functional.relu.0.forward.output.0.pt"
    }
   ]
-  },
  }
 }
 }  
--- a/docs/source/developer_guide/performance_and_debug/optimization_and_tuning.md
+++ b/docs/source/developer_guide/performance_and_debug/optimization_and_tuning.md
@@ -111,7 +111,7 @@ sudo apt update
 sudo apt install libjemalloc2

 # Configure jemalloc
-export LD_PRELOAD=/usr/lib/"$(uname -i)"-linux-gnu/libjemalloc.so.2 $LD_PRELOAD
+export LD_PRELOAD=/usr/lib/"$(uname -i)"-linux-gnu/libjemalloc.so.2:$LD_PRELOAD
 ```

 #### 2.2. Tcmalloc
--- a/docs/source/developer_guide/performance_and_debug/performance_benchmark.md
+++ b/docs/source/developer_guide/performance_and_debug/performance_benchmark.md
@@ -97,7 +97,8 @@ For local `dataset-path`, please set `hf-name` to its Hugging Face ID like
 First start serving your model:

 ```bash
-VLLM_USE_MODELSCOPE=True vllm serve Qwen/Qwen3-8B
+export VLLM_USE_MODELSCOPE=True 
+vllm serve Qwen/Qwen3-8B
 ```

 Then run the benchmarking script:
@@ -158,7 +159,7 @@ vllm bench throughput \
 If successful, you will see the following output

 ```shell
-Processed prompts: 100%|█| 10/10 [00:03<00:00,  2.74it/s, est. speed input: 351.02 toks/s, output: 351.02 t
+Processed prompts: 100%|█| 10/10 [00:03<00:00,  2.74it/s, est. speed input: 351.02 toks/s, output: 351.02 toks/s
 Throughput: 2.73 requests/s, 699.93 total tokens/s, 349.97 output tokens/s
 Total num prompt tokens:  1280
 Total num output tokens:  1280