diff --git a/docs/source/tutorials/DeepSeek-V3.2.md b/docs/source/tutorials/DeepSeek-V3.2.md
index 199acfd6..30f8d907 100644
--- a/docs/source/tutorials/DeepSeek-V3.2.md
+++ b/docs/source/tutorials/DeepSeek-V3.2.md
@@ -454,10 +454,10 @@ Before you start, please
     --seed 1024 \
     --served-model-name dsv3 \
     --max-model-len 68000 \
-    --max-num-batched-tokens 4 \
-    --compilation-config '{"cudagraph_mode":"FULL_DECODE_ONLY", "cudagraph_capture_sizes":[2, 4, 6, 8]}' \
+    --max-num-batched-tokens 12 \
+    --compilation-config '{"cudagraph_mode":"FULL_DECODE_ONLY", "cudagraph_capture_sizes":[3, 6, 9, 12]}' \
     --trust-remote-code \
-    --max-num-seqs 1 \
+    --max-num-seqs 4 \
     --gpu-memory-utilization 0.95 \
     --no-enable-prefix-caching \
     --async-scheduling \
@@ -479,7 +479,8 @@ Before you start, please
         "tp_size": 4
       }
     }
-  }'
+  }' \
+  --additional-config '{"recompute_scheduler_enable": true}'
 ```
 
 4. Decode node 1
@@ -532,11 +533,11 @@ Before you start, please
     --seed 1024 \
     --served-model-name dsv3 \
     --max-model-len 68000 \
-    --max-num-batched-tokens 4 \
-    --compilation-config '{"cudagraph_mode":"FULL_DECODE_ONLY", "cudagraph_capture_sizes":[2, 4, 6, 8]}' \
+    --max-num-batched-tokens 12 \
+    --compilation-config '{"cudagraph_mode":"FULL_DECODE_ONLY", "cudagraph_capture_sizes":[3, 6, 9, 12]}' \
     --trust-remote-code \
     --async-scheduling \
-    --max-num-seqs 1 \
+    --max-num-seqs 4 \
     --gpu-memory-utilization 0.95 \
     --no-enable-prefix-caching \
     --quantization ascend \
@@ -557,7 +558,8 @@ Before you start, please
         "tp_size": 4
       }
     }
-  }'
+  }' \
+  --additional-config '{"recompute_scheduler_enable": true}'
 ```
 
 Once the preparation is done, you can start the server with the following command on each node:
@@ -639,6 +641,16 @@ lm_eval \
 
 Refer to [Using AISBench for performance evaluation](../developer_guide/evaluation/using_ais_bench.md#execute-performance-evaluation) for details.
 
+With this tutorial, the performance result is:
+
+**Hardware**: A3-752T, 4 nodes
+
+**Deployment**: 1P1D, Prefill node: DP2+TP16, Decode node: DP8+TP4
+
+**Input/Output**: 64k/3k
+
+**Performance**: 533 TPS, TPOT 32 ms
+
 ### Using vLLM Benchmark
 
 Run performance evaluation of `DeepSeek-V3.2-W8A8` as an example.
@@ -657,12 +669,8 @@ export VLLM_USE_MODELSCOPE=true
 vllm bench serve --model /root/.cache/Eco-Tech/DeepSeek-V3.2-w8a8-mtp-QuaRot --dataset-name random --random-input 200 --num-prompt 200 --request-rate 1 --save-result --result-dir ./
 ```
 
-After about several minutes, you can get the performance evaluation result. With this tutorial, the performance result is:
+## Function Call
 
-**Hardware**: A3-752T, 4 node
+The function call feature is supported since v0.13.0rc1. Please use the latest version.
 
-**Deployment**: 1P1D, Prefill node: DP2+TP16, Decode Node: DP8+TP4
-
-**Input/Output**: 64k/3k
-
-**Performance**: 255tps, TPOT 23ms
+Refer to [DeepSeek-V3.2 Usage Guide](https://docs.vllm.ai/projects/recipes/en/latest/DeepSeek/DeepSeek-V3_2.html#tool-calling-example) for details.
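
For reviewers, here is a condensed sketch of the retuned decode-node invocation after this change. It is a sketch only: `<model-path>` is a placeholder, all other flags (parallelism, KV-transfer config, quantization, and so on) are elided, and the MTP reading in the comments (each of the 4 sequences contributing up to 3 tokens per decode step, so batched tokens and capture sizes move in multiples of 3) is an inference from the new values, not something the diff states:

```bash
# Sketch only: just the decode-node flags touched by this change.
# Assumption: with MTP speculative decoding, each of the 4 sequences can
# contribute up to 3 tokens per decode step, hence
#   max-num-batched-tokens = 4 seqs x 3 tokens = 12
# and cudagraph capture sizes at multiples of 3.
vllm serve <model-path> \
    --max-num-seqs 4 \
    --max-num-batched-tokens 12 \
    --compilation-config '{"cudagraph_mode":"FULL_DECODE_ONLY", "cudagraph_capture_sizes":[3, 6, 9, 12]}' \
    --additional-config '{"recompute_scheduler_enable": true}'
```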