[Doc] Update max_tokens to max_completion_tokens in all docs (#6248)

### What this PR does / why we need it? Fix: ``` DeprecationWarning: max_tokens is deprecated in favor of the max_completion_tokens field. ``` - vLLM version: v0.14.1 - vLLM main: d68209402d Signed-off-by: shen-shanshan <467638484@qq.com>
2026-01-26 11:57:40 +08:00
parent 418fccf0bc
commit e3eefdecbd
28 changed files with 43 additions and 43 deletions
--- a/docs/source/tutorials/Qwen3-Next.md
+++ b/docs/source/tutorials/Qwen3-Next.md
@@ -75,7 +75,7 @@ curl http://localhost:8000/v1/chat/completions -H "Content-Type: application/jso
  "temperature": 0.6,
  "top_p": 0.95,
  "top_k": 20,
-  "max_tokens": 32
+  "max_completion_tokens": 32
 }'
 ```

@@ -103,7 +103,7 @@ if __name__ == '__main__':
    prompts = [
        "Who are you?",
    ]
-    sampling_params = SamplingParams(temperature=0.6, top_p=0.95, top_k=40, max_tokens=32)
+    sampling_params = SamplingParams(temperature=0.6, top_p=0.95, top_k=40, max_completion_tokens=32)
    llm = LLM(model="Qwen/Qwen3-Next-80B-A3B-Instruct",
              tensor_parallel_size=4,
              enforce_eager=True,