Docs: Remove deprecated --task parameter for embedding models (#5257)

Fixes #3376

- Replace --task embed with --runner pooling in the vllm serve command in Qwen3_embedding.md
- Remove the task='embed' parameter from the LLM constructor in the Python example

The --task parameter has been deprecated in recent vLLM versions in favor of automatic model type detection.

- vLLM version: release/v0.13.0
- vLLM main: ad32e3e19c

---------

Signed-off-by: hu-qi <huqi1024@gmail.com>
2025-12-30 16:09:07 +08:00
parent 71f729a661
commit c85cc045f8
1 changed files with 1 additions and 2 deletions
--- a/docs/source/tutorials/Qwen3_embedding.md
+++ b/docs/source/tutorials/Qwen3_embedding.md
@@ -30,7 +30,7 @@ Using the Qwen3-Embedding-8B model as an example, first run the docker container
 ### Online Inference

 ```bash
-vllm serve Qwen/Qwen3-Embedding-8B --task embed --host 127.0.0.1 --port 8888
+vllm serve Qwen/Qwen3-Embedding-8B --runner pooling --host 127.0.0.1 --port 8888
 ```

 Once your server is started, you can query the model with input prompts.
@@ -71,7 +71,6 @@ if __name__=="__main__":
    input_texts = queries + documents

    model = LLM(model="Qwen/Qwen3-Embedding-8B",
-                task="embed",
                distributed_executor_backend="mp")

    outputs = model.embed(input_texts)
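
The flag change in the first hunk above may need to be repeated in other launch scripts that still pass `--task embed`. The following is a minimal sketch of how that rewrite could be automated. The helper name `migrate_serve_command` is hypothetical and not part of vLLM; it only mirrors the one-line substitution shown in this diff and assumes the command is a plain shell-style string.

```python
import shlex


def migrate_serve_command(command: str) -> str:
    """Rewrite `--task embed` to `--runner pooling` in a vllm serve command.

    Hypothetical helper (not a vLLM API); it reproduces only the flag
    substitution made in this commit and leaves every other token alone.
    """
    tokens = shlex.split(command)
    migrated = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        # Space-separated form: `--task embed`
        if token == "--task" and i + 1 < len(tokens) and tokens[i + 1] == "embed":
            migrated += ["--runner", "pooling"]
            i += 2
        # Equals form: `--task=embed`
        elif token == "--task=embed":
            migrated.append("--runner=pooling")
            i += 1
        else:
            migrated.append(token)
            i += 1
    return shlex.join(migrated)


old = "vllm serve Qwen/Qwen3-Embedding-8B --task embed --host 127.0.0.1 --port 8888"
print(migrate_serve_command(old))
```

Commands that do not contain `--task embed` pass through unchanged, so the helper is safe to run over a mixed set of launch scripts.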