[Docs] Update docs for gemma3 and VLM chat templates (#4674)
This commit is contained in:
committed by
GitHub
parent
321ab756bc
commit
fb8886037c
@@ -10,10 +10,13 @@
|
||||
"A complete reference for the API is available in the [OpenAI API Reference](https://platform.openai.com/docs/guides/vision).\n",
|
||||
"This tutorial covers the vision APIs for vision language models.\n",
|
||||
"\n",
|
||||
"SGLang supports vision language models such as Llama 3.2, LLaVA-OneVision, and QWen-VL2 \n",
|
||||
"SGLang supports various vision language models such as Llama 3.2, LLaVA-OneVision, Qwen2.5-VL, Gemma3 and [more](https://docs.sglang.ai/references/supported_models): \n",
|
||||
"- [meta-llama/Llama-3.2-11B-Vision-Instruct](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct) \n",
|
||||
"- [lmms-lab/llava-onevision-qwen2-72b-ov-chat](https://huggingface.co/lmms-lab/llava-onevision-qwen2-72b-ov-chat) \n",
|
||||
"- [Qwen/Qwen2-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct) \n",
|
||||
"- [Qwen/Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct)\n",
|
||||
"- [google/gemma-3-4b-it](https://huggingface.co/google/gemma-3-4b-it)\n",
|
||||
"- [openbmb/MiniCPM-V](https://huggingface.co/openbmb/MiniCPM-V)\n",
|
||||
"- [deepseek-ai/deepseek-vl2](https://huggingface.co/deepseek-ai/deepseek-vl2)\n",
|
||||
"\n",
|
||||
"As an alternative to the OpenAI API, you can also use the [SGLang offline engine](https://github.com/sgl-project/sglang/blob/main/examples/runtime/engine/offline_batch_inference_vlm.py)."
|
||||
]
|
||||
@@ -26,7 +29,7 @@
|
||||
"\n",
|
||||
"Launch the server in your terminal and wait for it to initialize.\n",
|
||||
"\n",
|
||||
"**Remember to add** `--chat-template llama_3_vision` **to specify the vision chat template, otherwise the server only supports text, and performance degradation may occur.**\n",
|
||||
"**Remember to add** `--chat-template llama_3_vision` **to specify the [vision chat template](https://docs.sglang.ai/backend/openai_api_vision.html#Chat-Template), otherwise, the server will only support text (images won’t be passed in), which can lead to degraded performance.**\n",
|
||||
"\n",
|
||||
"We need to specify `--chat-template` for vision language models because the chat template provided in Hugging Face tokenizer only supports text."
|
||||
]
|
||||
@@ -259,7 +262,10 @@
|
||||
"We list popular vision models with their chat templates:\n",
|
||||
"\n",
|
||||
"- [meta-llama/Llama-3.2-Vision](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision-Instruct) uses `llama_3_vision`.\n",
|
||||
"- [Qwen/Qwen2-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct) uses `qwen2-vl`.\n",
|
||||
"- [Qwen/Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct) uses `qwen2-vl`.\n",
|
||||
"- [google/gemma-3-4b-it](https://huggingface.co/google/gemma-3-4b-it) uses `gemma-it`.\n",
|
||||
"- [openbmb/MiniCPM-V](https://huggingface.co/openbmb/MiniCPM-V) uses `minicpmv`.\n",
|
||||
"- [deepseek-ai/deepseek-vl2](https://huggingface.co/deepseek-ai/deepseek-vl2) uses `deepseek-vl2`.\n",
|
||||
"- [LlaVA-OneVision](https://huggingface.co/lmms-lab/llava-onevision-qwen2-7b-ov) uses `chatml-llava`.\n",
|
||||
"- [LLaVA-NeXT](https://huggingface.co/collections/lmms-lab/llava-next-6623288e2d61edba3ddbf5ff) uses `chatml-llava`.\n",
|
||||
"- [Llama3-LLaVA-NeXT](https://huggingface.co/lmms-lab/llama3-llava-next-8b) uses `llava_llama_3`.\n",
|
||||
|
||||
@@ -3,7 +3,7 @@
|
||||
## Generative Models
|
||||
- Llama / Llama 2 / Llama 3 / Llama 3.1 / Llama 3.2 / Llama 3.3
|
||||
- Mistral / Mixtral / Mistral NeMo / Mistral Small 3
|
||||
- Gemma / Gemma 2
|
||||
- Gemma / Gemma 2 / Gemma3
|
||||
- Qwen / Qwen 2 / Qwen 2 MoE / Qwen 2 VL / Qwen 2.5 VL
|
||||
- DeepSeek / DeepSeek 2 / [DeepSeek 3](https://github.com/sgl-project/sglang/tree/main/benchmark/deepseek_v3)
|
||||
- OLMoE
|
||||
|
||||
Reference in New Issue
Block a user