[Doc] Add note for unsupported PCP + FULL (#7559)

### What this PR does / why we need it?

This PR adds a note to the graph mode doc stating that FULL mode is not supported in the PCP scenario.

Signed-off-by: Zetong Li <slippersss@126.com>
2026-03-23 17:34:51 +08:00
parent 9976e685b7
commit a253235a59
1 changed file with 4 additions and 0 deletions
--- a/docs/source/user_guide/feature_guide/graph_mode.md
+++ b/docs/source/user_guide/feature_guide/graph_mode.md
@@ -4,6 +4,10 @@
 This feature is currently experimental. In future versions, there may be behavioral changes around configuration, coverage, performance improvement.
 ```

+```{note}
+In context parallel scenario (i.e. prefill_context_parallel_size * decode_context_parallel_size > 1), "cudagraph_mode" is not sufficiently supported to be set to "FULL" yet.
+```
+
 This guide provides instructions for using Ascend Graph Mode with vLLM Ascend. Please note that graph mode is only available on V1 Engine. And only Qwen, DeepSeek series models are well tested from 0.9.0rc1. We will make it stable and generalized in the next release.

 ## Getting Started