[Doc] Update max_tokens to max_completion_tokens in all docs (#6248)

### What this PR does / why we need it?

Fix:

```
DeprecationWarning: max_tokens is deprecated in favor of the max_completion_tokens field.
```

- vLLM version: v0.14.1
- vLLM main: d68209402d

Signed-off-by: shen-shanshan <467638484@qq.com>
2026-01-26 11:57:40 +08:00
parent 418fccf0bc
commit e3eefdecbd
28 changed files with 43 additions and 43 deletions
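
The rename in this commit can be illustrated with a minimal sketch. It assumes the setting lives in a plain dict, such as the JSON passed to `--generation-config` in the diff below. The helper name `migrate_generation_config` is hypothetical and is not part of vLLM or evalscope. The source does not confirm that every call site accepts the new name, so check each API before applying it.

```python
import json


def migrate_generation_config(config: dict) -> dict:
    """Return a copy of `config` with `max_tokens` renamed to `max_completion_tokens`.

    Hypothetical helper for illustration only; it is not a vLLM or evalscope API.
    """
    migrated = dict(config)  # shallow copy so the caller's dict is left untouched
    if "max_tokens" in migrated:
        old_value = migrated.pop("max_tokens")
        # If the new key is already set explicitly, keep that value.
        migrated.setdefault("max_completion_tokens", old_value)
    return migrated


# Same values as the evalscope example in the diff below.
old_config = {"max_tokens": 10000, "temperature": 0.6}
new_config = migrate_generation_config(old_config)
print(json.dumps(new_config))
```

The helper copies the input rather than mutating it, so the old and new configs can be compared side by side during a migration.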
--- a/docs/source/tutorials/Qwen3-Omni-30B-A3B-Thinking.md
+++ b/docs/source/tutorials/Qwen3-Omni-30B-A3B-Thinking.md
@@ -123,7 +123,7 @@ def main():
        temperature=0.6,
        top_p=0.95,
        top_k=20,
-        max_tokens=16384,
+        max_completion_tokens=16384,
    )

    processor = Qwen3OmniMoeProcessor.from_pretrained(MODEL_PATH)
@@ -243,7 +243,7 @@ evalscope eval \
    --datasets omni_bench, gsm8k, bbh \
    --dataset-args '{"omni_bench": { "extra_params": { "use_image": true, "use_audio": false}}}' \
    --eval-batch-size 1 \
-    --generation-config '{"max_tokens": 10000, "temperature": 0.6}' \
+    --generation-config '{"max_completion_tokens": 10000, "temperature": 0.6}' \
    --limit 100
 ```