sglang

Author	SHA1	Message	Date
yhyang201	a85363c199	[docs] Instructions for bench_serving.py (#9071 ) Co-authored-by: Mick <mickjagger19@icloud.com> Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com> Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com> Co-authored-by: zhaochenyang20 <zhaochenyang20@gmail.com> Co-authored-by: Yineng Zhang <me@zhyncs.com>	2025-08-26 18:30:57 -07:00
Lianmin Zheng	2e8e7e353b	Improve docs and developer guide (#9044 )	2025-08-10 21:05:18 -07:00
Lianmin Zheng	2449a0afe2	Refactor the docs (#9031 )	2025-08-10 19:49:45 -07:00
Lianmin Zheng	706bd69cc5	Clean up server_args.py to have a dedicated function for model specific adjustments (#8983 )	2025-08-08 19:56:50 -07:00
Kevin Xiang Li	44d600cd67	Support precomputed_embeddings for Llama 4 (#8156 ) Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com> Co-authored-by: Xiang (Kevin) Li <lik@nvidia.com> Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com> Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>	2025-07-27 01:14:49 -07:00
Lianmin Zheng	0f218731e3	Do not run frontend_reasoning.ipynb to reduce the CI load (#7073 )	2025-06-10 17:15:31 -07:00
Yudi Xue	14c18d25df	Frontend language separate reasoning support (#6031 )	2025-06-10 17:11:29 -07:00
Lianmin Zheng	bb185b0e92	Update README.md (#7040 )	2025-06-10 01:59:14 -07:00
Marc Sun	37f1547587	[FEAT] Add transformers backend support (#5929 )	2025-06-03 21:05:29 -07:00
linzhuo	7a0bbe6a64	update toc for doc and dockerfile code style format (#6450 ) Co-authored-by: Chayenne <zhaochen20@outlook.com>	2025-05-27 13:05:11 +08:00
simveit	e235be16fe	Fix some issues with current docs. (#6588 )	2025-05-26 01:04:34 +08:00
Byron Hsu	7513558074	[PD] Add doc and simplify sender.send (#6019 )	2025-05-21 21:22:21 -07:00
Mick	cd7c8a8de6	doc: update developer guide regarding mllms (#6138 ) Signed-off-by: Xinyuan Tong <justinning0323@outlook.com> Co-authored-by: XinyuanTong <115166877+JustinTong0323@users.noreply.github.com> Co-authored-by: Xinyuan Tong <justinning0323@outlook.com>	2025-05-14 23:13:13 +08:00
Lianmin Zheng	e8e18dcdcc	Revert "fix some typos" (#6244 )	2025-05-12 12:53:26 -07:00
applesaucethebun	d738ab52f8	fix some typos (#6209 ) Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>	2025-05-13 01:42:38 +08:00
江家瑋	ad506a4e6b	docs: Fix Qwen model typo (#5944 ) Signed-off-by: JiangJiaWei1103 <waynechuang97@gmail.com>	2025-05-01 10:23:00 -07:00
Lianmin Zheng	155890e4d1	[Minor] fix documentations (#5756 )	2025-04-26 17:48:43 -07:00
Baizhou Zhang	072b4d0398	Add document for LoRA serving (#5521 )	2025-04-20 14:37:57 -07:00
mlmz	f13d65a7ea	Doc: fix problems of the 'Execute Notebooks / run-all-notebooks' ci caused by the unstability of deepseek-ai/DeepSeek-R1-Distill-Qwen-7B (#5503 )	2025-04-17 11:37:43 -07:00
Ying Sheng	d7bc19a46a	add multi-lora feature in README.md (#5463 )	2025-04-16 03:25:25 -07:00
mRSun15	3efc8e2d2a	add attention backend supporting matrix in the doc (#5211 ) Co-authored-by: Stefan He <hebiaobuaa@gmail.com>	2025-04-15 17:16:34 -07:00
Adarsh Shirawalmath	4aa6bab0b0	[Docs] Supported Model Docs - Major restructuring (#5290 ) Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>	2025-04-11 09:17:47 -07:00
mlmz	7c5658c189	feat: disable grammar restrictions within reasoning sections (#4984 ) Co-authored-by: tianhaoyu <thy@mail.ecust.edu.cn> Co-authored-by: DarkSharpness <2040703891@qq.com>	2025-04-07 21:46:47 -07:00
Ke Bao	ade714a67f	Add Llama4 user guide (#5133 ) Co-authored-by: Cheng Wan <54331508+ch-wan@users.noreply.github.com>	2025-04-07 19:09:34 -07:00
Lianmin Zheng	c38ca4fc8e	Update readme (#4517 )	2025-03-17 08:22:42 -07:00
Yineng Zhang	00f42707ea	update doc (#4299 )	2025-03-11 01:14:16 -07:00
Chayenne	e70fa279bc	Docs: reorganize dpsk docs (#4108 )	2025-03-05 13:01:03 -08:00
Tommy Yang	abe74b7b59	Docs: Add DeepSeek optimization ablations documentation (#4107 ) Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>	2025-03-05 12:25:51 -08:00
Xihuai Wang	95575aa76a	Reasoning parser (#4000 ) Co-authored-by: Lucas Pickup <lupickup@microsoft.com>	2025-03-03 21:16:36 -08:00
simveit	acd1a15921	Docs: Implemented frontend docs (#3791 ) Co-authored-by: Chayenne <zhaochen20@outlook.com>	2025-02-26 15:30:05 -08:00
Chayenne	8f019c7d1a	Docs: Move dpsk docs forward a step (#3894 )	2025-02-26 11:43:20 -08:00
Chayenne	3c7bfd7eab	Docs: Fix layout with sub-section (#3710 )	2025-02-19 15:44:30 -08:00
Shi Shuai	55de40f782	[Docs]: Fix Multi-User Port Allocation Conflicts (#3601 ) Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com> Co-authored-by: simveit <simp.veitner@gmail.com>	2025-02-19 11:15:44 -08:00
ybyang	c51dc2cc8d	Docs: Deploy multi-node inference (LWS method) using sglang in a K8s cluster (#3624 )	2025-02-17 18:14:20 -08:00
Shi Shuai	7443197a63	[CI] Improve Docs CI Efficiency (#3587 ) Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>	2025-02-14 19:57:00 -08:00
Wenxuan Tan	0af1d239cb	[Docs] Add quantization docs (#3410 ) Co-authored-by: yinfan98 <1106310035@qq.com>	2025-02-10 02:16:21 +08:00
Zachary Streeter	0a6f18f068	added amd_configure.md to references (#3275 ) Co-authored-by: HAI <hixiao@gmail.com> Co-authored-by: Yineng Zhang <me@zhyncs.com> Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>	2025-02-07 08:50:49 -08:00
Chayenne	76ca91dff2	Docs/CI: Enable Fake Finish for Docs Only PR (#3350 )	2025-02-06 19:33:31 -08:00
Liangjun Song	455bfe8dd3	Add a Doc about guide on nvidia jetson #3182 (#3205 ) Co-authored-by: Shi Shuai <126407087+shuaills@users.noreply.github.com> Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>	2025-02-02 20:29:10 -08:00
simveit	c27c378a19	docs/accuracy evaluation (#3114 ) Co-authored-by: Shi Shuai <126407087+shuaills@users.noreply.github.com> Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>	2025-02-02 11:01:39 -08:00
Jhin	9472e69963	Doc: Add Docs about EAGLE speculative decoding (#3144 ) Co-authored-by: Chayenne <zhaochenyang@ucla.edu> Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>	2025-01-26 17:49:13 -08:00
Chayenne	1acc1f561a	[Docs]: Add function calling in index.rst (#3155 )	2025-01-26 11:11:27 -08:00
Shi Shuai	c4f9707e16	Improve: Token-In Token-Out Usage for RLHF (#2843 )	2025-01-11 15:14:26 -08:00
Xiaotong Jiang	11fffbc95a	[Doc]: Deepseek reference docs (#2787 )	2025-01-09 13:43:12 -08:00
Chayenne	2e6346fc2e	Docs：Update the style of llma 3.1 405B docs (#2789 )	2025-01-08 01:07:54 -08:00
mlmz	977f785dad	Docs: Rewrite docs for LLama 405B and ModelSpace (#2773 ) Co-authored-by: Chayenne <zhaochen20@outlook.com>	2025-01-08 00:02:59 -08:00
Shi Shuai	062c48d2bd	[Docs] Add Support for Pydantic Structured Output Format (#2697 )	2025-01-01 15:08:43 -08:00
Chayenne	0d8d97b8e6	Doc: Rename contribution_guide.md (#2691 )	2024-12-31 14:35:48 -08:00
Lianmin Zheng	bdd2827a80	Update structured_outputs.ipynb (#2666 )	2024-12-30 00:46:41 -08:00
Shi Shuai	239c9d4d3a	Docs: Add constrained decoding tutorial (#2614 ) Co-authored-by: Chayenne <zhaochen20@outlook.com>	2024-12-27 23:54:28 -08:00

68 Commits