[FEATURE] Add OpenAI-Compatible LoRA Adapter Selection (#11570)

This commit is contained in:
Neelabh Sinha
2025-10-21 00:44:33 -07:00
committed by GitHub
parent 7e6191c098
commit 852c0578fd
10 changed files with 815 additions and 40 deletions


@@ -59,6 +59,17 @@
"### Serving Single Adaptor"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note:** SGLang supports LoRA adapters through two APIs:\n",
"\n",
"1. **OpenAI-Compatible API** (`/v1/chat/completions`, `/v1/completions`): Use the `model:adapter-name` syntax. See [OpenAI API with LoRA](../basic_usage/openai_api_completions.ipynb#Using-LoRA-Adapters) for examples.\n",
"\n",
"2. **Native API** (`/generate`): Pass `lora_path` in the request body (shown below)."
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -379,6 +390,15 @@
"print(f\"Output from lora1 (updated): \\n{response.json()[1]['text']}\\n\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### OpenAI-compatible API usage\n",
"\n",
"You can use LoRA adapters via the OpenAI-compatible APIs by specifying the adapter in the `model` field using the `base-model:adapter-name` syntax (for example, `qwen/qwen2.5-0.5b-instruct:adapter_a`). For more details and examples, see the “Using LoRA Adapters” section in the OpenAI API documentation: [openai_api_completions.ipynb](../basic_usage/openai_api_completions.ipynb).\n"
]
},
{
"cell_type": "code",
"execution_count": null,