[CI] Accuracy issue of qwen3-next-w8a8 nightly test fix. (#5746)

### What this PR does / why we need it? Close the **Full Graph** mode to temporarily avoid accuracy issue for **Qwen3-Next-80B-A3B-Instruct-W8A8**. ### Does this PR introduce _any_ user-facing change? N/A ### How was this patch tested? - vLLM version: v0.13.0 - vLLM main: 2f4e6548ef --------- Signed-off-by: InSec <1790766300@qq.com>
2026-01-09 15:55:13 +08:00
parent be941cab71
commit 2d713fee93
1 changed files with 1 additions and 1 deletions
--- a/tests/e2e/nightly/single_node/models/test_qwen3_next_w8a8.py
+++ b/tests/e2e/nightly/single_node/models/test_qwen3_next_w8a8.py
@@ -78,7 +78,7 @@ async def test_models(model: str) -> None:
        "--gpu-memory-utilization",
        "0.65",
        "--compilation-config",
-        '{"cudagraph_capture_sizes": [32], "cudagraph_mode":"FULL_DECODE_ONLY"}',
+        '{"cudagraph_capture_sizes": [32]}',
    ]
    request_keyword_args: dict[str, Any] = {
        **api_keyword_args,