[Docs] clean up structured outputs docs (#2654)

This commit is contained in:
Lianmin Zheng
2024-12-29 23:57:16 -08:00
committed by GitHub
parent e6f523b5f2
commit 8c3b420eec
10 changed files with 62 additions and 70 deletions


@@ -159,10 +159,10 @@ python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3.1-405B-Instr
# Run 405B (fp16) on two nodes
## on the first node, replace the `172.16.4.52:20000` with your own first node ip address and port
GLOO_SOCKET_IFNAME=eth0 python3 -m sglang.launch_server --model-path meta-llama/Meta-Llama-3.1-405B-Instruct --tp 16 --nccl-init-addr 172.16.4.52:20000 --nnodes 2 --node-rank 0 --disable-cuda-graph
python3 -m sglang.launch_server --model-path meta-llama/Meta-Llama-3.1-405B-Instruct --tp 16 --nccl-init-addr 172.16.4.52:20000 --nnodes 2 --node-rank 0
## on the second node, replace the `172.16.4.52:20000` with your own first node ip address and port
GLOO_SOCKET_IFNAME=eth0 python3 -m sglang.launch_server --model-path meta-llama/Meta-Llama-3.1-405B-Instruct --tp 16 --nccl-init-addr 172.16.4.52:20000 --nnodes 2 --node-rank 1 --disable-cuda-graph
python3 -m sglang.launch_server --model-path meta-llama/Meta-Llama-3.1-405B-Instruct --tp 16 --nccl-init-addr 172.16.4.52:20000 --nnodes 2 --node-rank 1
```
</details>


@@ -221,17 +221,15 @@
"metadata": {},
"source": [
"## Structured Outputs (JSON, Regex, EBNF)\n",
"You can specify a JSON schema, Regular Expression or [EBNF](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form) to constrain the model output. The model output will be guaranteed to follow the given constraints. \n",
"You can specify a JSON schema, [regular expression](https://en.wikipedia.org/wiki/Regular_expression) or [EBNF](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form) to constrain the model output. The model output will be guaranteed to follow the given constraints. Only one constraint parameter (`json_schema`, `regex`, or `ebnf`) can be specified for a request.\n",
"\n",
"SGLang supports two grammar backends:\n",
"\n",
"- [Outlines](https://github.com/dottxt-ai/outlines) (default): Supports JSON schema and Regular Expression constraints.\n",
"- [Outlines](https://github.com/dottxt-ai/outlines) (default): Supports JSON schema and regular expression constraints.\n",
"- [XGrammar](https://github.com/mlc-ai/xgrammar): Supports JSON schema and EBNF constraints.\n",
" - XGrammar currently uses the [GGML BNF format](https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md)\n",
"\n",
"> 🔔 Only one constraint parameter (`json_schema`, `regex`, or `ebnf`) can be specified at a time.\n",
"\n",
"Initialise xgrammar backend using `--grammar-backend xgrammar` flag\n",
"Initialize the XGrammar backend using the `--grammar-backend xgrammar` flag:\n",
"```bash\n",
"python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3.1-8B-Instruct \\\n",
"--port 30000 --host 0.0.0.0 --grammar-backend [xgrammar|outlines] # xgrammar or outlines (default: outlines)\n",
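
The rule stated in this hunk (only one of `json_schema`, `regex`, or `ebnf` per request) can be checked on the client before a request is sent. The following is a minimal sketch, not code from this commit: the helper names `build_sampling_params` and `build_generate_payload` are hypothetical, and the payload shape (`text` plus `sampling_params`) assumes the server's native `/generate` endpoint on the port used above.

```python
from typing import Any, Dict, Optional


def build_sampling_params(
    json_schema: Optional[str] = None,
    regex: Optional[str] = None,
    ebnf: Optional[str] = None,
    **extra: Any,
) -> Dict[str, Any]:
    """Build sampling parameters, allowing at most one constraint.

    Hypothetical helper: it mirrors the documented rule that only one of
    json_schema, regex, or ebnf may be specified for a request.
    """
    constraints = {"json_schema": json_schema, "regex": regex, "ebnf": ebnf}
    chosen = {name: value for name, value in constraints.items() if value is not None}
    if len(chosen) > 1:
        raise ValueError(
            "Only one constraint may be set, got: " + ", ".join(sorted(chosen))
        )
    params = dict(extra)   # e.g. temperature, max_new_tokens
    params.update(chosen)  # add the single constraint, if any
    return params


def build_generate_payload(prompt: str, **kwargs: Any) -> Dict[str, Any]:
    """Wrap a prompt and its sampling parameters in a request body.

    The {"text": ..., "sampling_params": ...} shape is an assumption about
    the native /generate endpoint.
    """
    return {"text": prompt, "sampling_params": build_sampling_params(**kwargs)}


payload = build_generate_payload(
    "The capital of France is",
    regex="(Paris|London)",
    temperature=0,
    max_new_tokens=16,
)
# With a server running, the payload could be sent with, for example:
#   requests.post("http://localhost:30000/generate", json=payload)
```

Rejecting conflicting constraints on the client keeps the error close to the call site; the server remains the authority on which constraint types the selected grammar backend supports.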


@@ -11,20 +11,22 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"With SGLang, You can define a JSON schema, EBNF or regular expression to constrain the model's output.\n",
"## Structured Outputs (JSON, Regex, EBNF)\n",
"You can specify a JSON schema, [regular expression](https://en.wikipedia.org/wiki/Regular_expression) or [EBNF](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form) to constrain the model output. The model output will be guaranteed to follow the given constraints. Only one constraint parameter (`json_schema`, `regex`, or `ebnf`) can be specified for a request.\n",
"\n",
"[JSON Schema](https://json-schema.org/): Formats output into structured JSON objects with validation rules.\n",
"SGLang supports two grammar backends:\n",
"\n",
"[EBNF (Extended Backus-Naur Form)](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form): Defines complex syntax rules, especially for recursive patterns like nested structures.\n",
"- [Outlines](https://github.com/dottxt-ai/outlines) (default): Supports JSON schema and regular expression constraints.\n",
"- [XGrammar](https://github.com/mlc-ai/xgrammar): Supports JSON schema and EBNF constraints.\n",
" - XGrammar currently uses the [GGML BNF format](https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md)\n",
"\n",
"[Regular Expressions](https://en.wikipedia.org/wiki/Regular_expression): Matches text patterns for simple validation and formatting.\n",
"Initialize the XGrammar backend using the `--grammar-backend xgrammar` flag:\n",
"```bash\n",
"python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3.1-8B-Instruct \\\n",
"--port 30000 --host 0.0.0.0 --grammar-backend [xgrammar|outlines] # xgrammar or outlines (default: outlines)\n",
"```\n",
"\n",
"## Grammar Backend\n",
"\n",
"SGLang has two backends: [Outlines](https://github.com/dottxt-ai/outlines) (default) and [XGrammar](https://blog.mlc.ai/2024/11/22/achieving-efficient-flexible-portable-structured-generation-with-xgrammar). We suggest using XGrammar whenever possible for its better performance. For more details, see [XGrammar technical overview](https://blog.mlc.ai/2024/11/22/achieving-efficient-flexible-portable-structured-generation-with-xgrammar).\n",
"\n",
"* Xgrammar Backend: JSON and EBNF\n",
"* Outlines Backend: JSON and regular expressions"
"We suggest using XGrammar whenever possible for its better performance. For more details, see [XGrammar technical overview](https://blog.mlc.ai/2024/11/22/achieving-efficient-flexible-portable-structured-generation-with-xgrammar)."
]
},
{
@@ -208,15 +210,6 @@
"metadata": {},
"outputs": [],
"source": [
"from sglang.utils import (\n",
" execute_shell_command,\n",
" wait_for_server,\n",
" terminate_process,\n",
" print_highlight,\n",
")\n",
"\n",
"import requests\n",
"\n",
"server_process = execute_shell_command(\n",
" \"\"\"\n",
"python3 -m sglang.launch_server --model-path meta-llama/Llama-3.2-1B-Instruct --port=30010 --grammar-backend xgrammar\n",