Fix and Clean up chat-template requirement for VLM (#6114)

Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>
2025-05-10 09:14:09 -07:00
parent c178abdabc
commit 9d8ec2e67e
16 changed files with 104 additions and 195 deletions
--- a/docs/backend/openai_api_vision.ipynb
+++ b/docs/backend/openai_api_vision.ipynb
@@ -27,11 +27,7 @@
   "source": [
    "## Launch A Server\n",
    "\n",
-    "Launch the server in your terminal and wait for it to initialize.\n",
-    "\n",
-    "**Remember to add** `--chat-template` **for example** `--chat-template=qwen2-vl` **to specify the [vision chat template](https://docs.sglang.ai/backend/openai_api_vision.html#Chat-Template), otherwise, the server will only support text (images won’t be passed in), which can lead to degraded performance.**\n",
-    "\n",
-    "We need to specify `--chat-template` for vision language models because the chat template provided in Hugging Face tokenizer only supports text."
+    "Launch the server in your terminal and wait for it to initialize."
   ]
  },
  {
@@ -51,8 +47,7 @@
    "\n",
    "vision_process, port = launch_server_cmd(\n",
    "    \"\"\"\n",
-    "python3 -m sglang.launch_server --model-path Qwen/Qwen2.5-VL-7B-Instruct \\\n",
-    "    --chat-template=qwen2-vl\n",
+    "python3 -m sglang.launch_server --model-path Qwen/Qwen2.5-VL-7B-Instruct\n",
    "\"\"\"\n",
    ")\n",
    "\n",
@@ -250,27 +245,6 @@
   "source": [
    "terminate_process(vision_process)"
   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Chat Template\n",
-    "\n",
-    "As mentioned before, if you do not specify a vision model's `--chat-template`, the server uses Hugging Face's default template, which only supports text.\n",
-    "\n",
-    "We list popular vision models with their chat templates:\n",
-    "\n",
-    "- [meta-llama/Llama-3.2-Vision](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct) uses `llama_3_vision`.\n",
-    "- [Qwen/Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct) uses `qwen2-vl`.\n",
-    "- [google/gemma-3-4b-it](https://huggingface.co/google/gemma-3-4b-it) uses `gemma-it`.\n",
-    "- [openbmb/MiniCPM-V](https://huggingface.co/openbmb/MiniCPM-V) uses `minicpmv`.\n",
-    "- [deepseek-ai/deepseek-vl2](https://huggingface.co/deepseek-ai/deepseek-vl2) uses `deepseek-vl2`.\n",
-    "- [LlaVA-OneVision](https://huggingface.co/lmms-lab/llava-onevision-qwen2-7b-ov) uses `chatml-llava`.\n",
-    "- [LLaVA-NeXT](https://huggingface.co/collections/lmms-lab/llava-next-6623288e2d61edba3ddbf5ff) uses `chatml-llava`.\n",
-    "- [Llama3-LLaVA-NeXT](https://huggingface.co/lmms-lab/llama3-llava-next-8b) uses `llava_llama_3`.\n",
-    "- [LLaVA-v1.5 / 1.6](https://huggingface.co/liuhaotian/llava-v1.6-34b) uses `vicuna_v1.1`."
-   ]
  }
 ],
 "metadata": {
--- a/docs/backend/sampling_params.md
+++ b/docs/backend/sampling_params.md
@@ -136,7 +136,7 @@ Detailed example in [openai compatible api](https://docs.sglang.ai/backend/opena
 Launch a server:

 ```bash
-python3 -m sglang.launch_server --model-path lmms-lab/llava-onevision-qwen2-7b-ov --chat-template chatml-llava
+python3 -m sglang.launch_server --model-path lmms-lab/llava-onevision-qwen2-7b-ov
 ```

 Download an image: