diff --git a/docs/source/tutorials/Qwen3-Next.md b/docs/source/tutorials/Qwen3-Next.md
index 9bde7964..a0a7a6d1 100644
--- a/docs/source/tutorials/Qwen3-Next.md
+++ b/docs/source/tutorials/Qwen3-Next.md
@@ -19,6 +19,9 @@ Refer to [feature guide](../user_guide/feature_guide/index.md) to get the featur
 Download Link for the `Qwen3-Next-80B-A3B-Instruct` Model Weights: [Download model weight](https://modelers.cn/models/Modelers_Park/Qwen3-Next-80B-A3B-Instruct/tree/main)
 
 ## Deployment
+
+If the machine is an Atlas 800I A3 (64G*16), the deployment procedure is identical.
+
 ### Run docker container
 
 ```{code-block} bash
@@ -92,7 +95,7 @@ Run the following script to start the vLLM server on multi-NPU:
 For an Atlas A2 with 64 GB of NPU card memory, tensor-parallel-size should be at least 4, and for 32 GB of memory, tensor-parallel-size should be at least 8.
 
 ```bash
-vllm serve Qwen/Qwen3-Next-80B-A3B-Instruct --tensor-parallel-size 4 --max-model-len 4096 --gpu-memory-utilization 0.85 --compilation-config '{"cudagraph_mode":"FULL_DECODE_ONLY"}'
+vllm serve Qwen/Qwen3-Next-80B-A3B-Instruct --tensor-parallel-size 4 --max-model-len 4096 --gpu-memory-utilization 0.7 --compilation-config '{"cudagraph_mode":"FULL_DECODE_ONLY"}'
 ```
 
 Once your server is started, you can query the model with input prompts.
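
The `vllm serve` command changed in the second hunk exposes an OpenAI-compatible HTTP API. A minimal sketch of querying it, assuming the server's default port 8000 and the standard `/v1/completions` endpoint (the `build_request` and `query` helper names are illustrative, not part of the tutorial):

```python
import json
from urllib.request import Request, urlopen


def build_request(prompt: str, base_url: str = "http://localhost:8000") -> Request:
    """Build a POST request for the OpenAI-compatible completions endpoint."""
    payload = {
        # Must match the model name passed to `vllm serve`.
        "model": "Qwen/Qwen3-Next-80B-A3B-Instruct",
        "prompt": prompt,
        "max_tokens": 64,
    }
    return Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def query(prompt: str) -> str:
    """Send the prompt to a running vLLM server and return the generated text."""
    with urlopen(build_request(prompt)) as resp:  # requires the server to be up
        return json.loads(resp.read())["choices"][0]["text"]


if __name__ == "__main__":
    print(query("Hello, my name is"))
```

An equivalent `curl` call against `/v1/completions` with the same JSON body works as well; the point is only that the endpoint speaks the OpenAI completions schema.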