[FEATURE] Add OpenAI-Compatible LoRA Adapter Selection (#11570)
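
The selection rule this change documents (adapter named in `model` as `base-model:adapter-name`, with `extra_body["lora_path"]` as the fallback) can be sketched roughly as follows. This is an illustrative sketch only, not SGLang's implementation: the helper names `split_model_and_adapter` and `resolve_adapter` are hypothetical, and splitting on the first colon is an assumption.

```python
from typing import Optional, Tuple


def split_model_and_adapter(model: str) -> Tuple[str, Optional[str]]:
    """Split "base-model:adapter-name" into (base_model, adapter_name).

    Hypothetical helper; assumes the first colon separates the two parts.
    """
    base, sep, adapter = model.partition(":")
    # No colon, or nothing after it: no adapter was requested via `model`.
    if not sep or not adapter:
        return base, None
    return base, adapter


def resolve_adapter(model: str, extra_body: Optional[dict] = None) -> Optional[str]:
    """Pick the adapter for a request.

    The `model:adapter` syntax takes precedence over extra_body["lora_path"],
    matching the note in the documentation added by this change.
    """
    _, adapter = split_model_and_adapter(model)
    if adapter is not None:
        return adapter
    # Fall back to the older extra_body method, if present.
    return (extra_body or {}).get("lora_path")
```

For example, `resolve_adapter("qwen/qwen2.5-0.5b-instruct:adapter_a", {"lora_path": "adapter_b"})` yields `"adapter_a"`, while the same call with `model="qwen/qwen2.5-0.5b-instruct"` falls back to `"adapter_b"`.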
@@ -361,6 +361,50 @@
"For OpenAI compatible structured outputs API, refer to [Structured Outputs](../advanced_features/structured_outputs.ipynb) for more details.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using LoRA Adapters\n",
"\n",
"SGLang serves LoRA (Low-Rank Adaptation) adapters through its OpenAI-compatible APIs. To select an adapter for a request, set the `model` parameter to `base-model:adapter-name`.\n",
"\n",
"**Server Setup:**\n",
"```bash\n",
"python -m sglang.launch_server \\\n",
" --model-path qwen/qwen2.5-0.5b-instruct \\\n",
" --enable-lora \\\n",
" --lora-paths adapter_a=/path/to/adapter_a adapter_b=/path/to/adapter_b\n",
"```\n",
"\n",
"For more details on LoRA serving configuration, see the [LoRA documentation](../advanced_features/lora.ipynb).\n",
"\n",
"**API Call:**\n",
"\n",
"(Recommended) Use the `model:adapter` syntax to specify which adapter to use:\n",
"```python\n",
"# Assumes `client` is an OpenAI client already configured for the SGLang server\n",
"response = client.chat.completions.create(\n",
" model=\"qwen/qwen2.5-0.5b-instruct:adapter_a\", # ← base-model:adapter-name\n",
" messages=[{\"role\": \"user\", \"content\": \"Convert to SQL: show all users\"}],\n",
" max_tokens=50,\n",
")\n",
"```\n",
"\n",
"**Backward Compatible: Using `extra_body`**\n",
"\n",
"The old `extra_body` method is still supported for backward compatibility:\n",
"```python\n",
"# Backward compatible method\n",
"response = client.chat.completions.create(\n",
" model=\"qwen/qwen2.5-0.5b-instruct\",\n",
" messages=[{\"role\": \"user\", \"content\": \"Convert to SQL: show all users\"}],\n",
" extra_body={\"lora_path\": \"adapter_a\"}, # ← old method\n",
" max_tokens=50,\n",
")\n",
"```\n",
"\n",
"**Note:** When both `model:adapter` and `extra_body[\"lora_path\"]` are specified, the `model:adapter` syntax takes precedence."
]
},
{
"cell_type": "code",
"execution_count": null,