sglang

Author	SHA1	Message	Date
Hongpeng Guo	949b3fbfce	[Doc] Update doc of custom logit processor (#3021) Signed-off-by: Hongpeng Guo <hpguo@anyscale.com>	2025-01-20 16:50:25 -08:00
Chayenne	2584f6d944	Docs: Add Performance Demonstaration for DPA (#3005)	2025-01-20 01:00:52 -08:00
Lianmin Zheng	03464890e0	Separate two entry points: Engine and HTTP server (#2996) Co-authored-by: fzyzcjy <5236035+fzyzcjy@users.noreply.github.com>	2025-01-19 22:09:24 -08:00
Enrique Shockwave	3bcf5ecea7	support regex in xgrammar backend (#2983)	2025-01-20 04:34:41 +08:00
Yineng Zhang	def5c31873	docs: update supported_models (#2987)	2025-01-20 00:44:30 +08:00
Mick	3d93f84a00	[Feature] Support minicpmv v2.6 (#2785) Co-authored-by: Chayenne <zhaochen20@outlook.com> Co-authored-by: yizhang2077 <1109276519@qq.com>	2025-01-18 14:14:19 -08:00
Wen Sun	120c3634ef	Fix Llama-3.1-405B References Docs (#2944)	2025-01-17 14:46:38 -08:00
Lianmin Zheng	8b6ce52e92	Support multi-node DP attention (#2925) Co-authored-by: dhou-xai <dhou@x.ai>	2025-01-16 11:15:00 -08:00
Yineng Zhang	41d7e5b7e6	docs: update link (#2857)	2025-01-13 18:40:48 +08:00
Lianmin Zheng	72c7776355	Fix linear.py and improve weight loading (#2851) Co-authored-by: SangBin Cho <rkooo567@gmail.com>	2025-01-13 01:39:14 -08:00
Shi Shuai	c4f9707e16	Improve: Token-In Token-Out Usage for RLHF (#2843)	2025-01-11 15:14:26 -08:00
Chayenne	5cc1170552	Doc: add block-wise FP8 in dpsk model reference (#2830)	2025-01-10 00:26:59 -08:00
Xiaotong Jiang	11fffbc95a	[Doc]: Deepseek reference docs (#2787)	2025-01-09 13:43:12 -08:00
Chayenne	2e6346fc2e	Docs：Update the style of llma 3.1 405B docs (#2789)	2025-01-08 01:07:54 -08:00
mlmz	977f785dad	Docs: Rewrite docs for LLama 405B and ModelSpace (#2773) Co-authored-by: Chayenne <zhaochen20@outlook.com>	2025-01-08 00:02:59 -08:00
Lianmin Zheng	0f9cc6d8d3	Fix package loss for small models (#2717) Co-authored-by: sdli1995 < mmlmonkey@163.com>	2025-01-02 18:25:26 -08:00
Shi Shuai	dd2e2d275f	Docs: Update documentation workflow and contribution guide (#2704) Co-authored-by: Chayenne <zhaochen20@outlook.com>	2025-01-02 09:18:31 -08:00
Shi Shuai	062c48d2bd	[Docs] Add Support for Pydantic Structured Output Format (#2697)	2025-01-01 15:08:43 -08:00
Shi Shuai	0a765bbccc	Docs: Refactor Contribution Guide (#2690)	2024-12-31 14:11:00 -08:00
Lianmin Zheng	8c3b420eec	[Docs] clean up structured outputs docs (#2654)	2024-12-29 23:57:16 -08:00
Adarsh Shirawalmath	fd34f2da35	[Docs] Add EBNF to sampling params docs (#2609)	2024-12-29 00:05:00 -08:00
Lianmin Zheng	751e5ca273	[minor] clean up docs and eos id (#2622)	2024-12-27 11:23:46 -08:00
Lianmin Zheng	2125898af5	Update contributor_guide.md (#2603)	2024-12-26 08:36:13 -08:00
Lianmin Zheng	dc3bee4815	Fix test and benchmark scripts (#2598)	2024-12-26 07:56:26 -08:00
Lianmin Zheng	773951548d	Fix logprob_start_len for multi modal models (#2597) Co-authored-by: libra <lihu723@gmail.com> Co-authored-by: fzyzcjy <ch271828n@outlook.com> Co-authored-by: Wang, Haoyu <haoyu.wang@intel.com>	2024-12-26 06:27:45 -08:00
Lianmin Zheng	8496701934	[Misc] Fix metrics, weight update lock, request logging (#2543)	2024-12-22 06:27:22 -08:00
Lianmin Zheng	21e9e63ad5	Print progress bar during cuda graph capture (#2502)	2024-12-17 06:33:46 -08:00
Fred Reiss	993956c6b1	Add support for IBM Granite 3.x models (#2437)	2024-12-11 06:30:23 -08:00
SangBin Cho	1f09e84b9a	nit: Remove busy waiting on scheduler (#2382)	2024-12-08 01:06:15 -08:00
Lianmin Zheng	0e7409adb6	Fix the overlap for xgrammar (#2377)	2024-12-06 05:49:29 -08:00
vchzls	3cde5eb629	docs: Improve instructions for supporting new models (#2363) Co-authored-by: zhaohoulong <zhaohoulong@xiaomi.com>	2024-12-06 04:27:17 -08:00
bjmsong	91e5dbf554	add profile in offline benchmark & update doc (#2123) Co-authored-by: root <bjmsong@126.com>	2024-11-27 14:57:13 -08:00
Rin Intachuen	1aea19f64b	Input_embeds support (#2052)	2024-11-25 16:35:04 -08:00
Lianmin Zheng	c211e7b669	Simplify batch update (#2154)	2024-11-24 04:47:10 -08:00
Lianmin Zheng	dfec7fca06	Rename sglang.bench_latency to sglang.bench_one_batch (#2118)	2024-11-21 20:07:48 -08:00
Tanjiro	8c280cee55	add phi-3 small support (#2062) Co-authored-by: Tushar Goel <114812108+AI-Tushar@users.noreply.github.com>	2024-11-17 18:47:43 -08:00
Xiaoyu Zhang	023d0a73df	fix small typos in docs (#2047)	2024-11-15 11:09:10 -08:00
ws	29ebe3dff4	fix: align enable_overlap_scheduler naming between code and docs (#2038)	2024-11-15 03:39:10 -08:00
RangiLyu	f18b9c7252	support internlm2-reward (#1994)	2024-11-11 15:09:58 -08:00
aqweteddy	f16eb15d0d	Gemma2 reward model support (#1954)	2024-11-07 22:42:27 -08:00
Yudi Xue	5bc2508b80	Monitoring documentation (#1933)	2024-11-07 22:14:16 -08:00
Lianmin Zheng	1ae270c5d0	[Doc] fix docs (#1949)	2024-11-07 18:20:41 -08:00
Xuehai Pan	a5e0defb5a	minor: Add basic editorconfig and pre-commit hooks to enforce style for whitespaces (#1926)	2024-11-06 13:46:04 +00:00
Lianmin Zheng	f5113e50ae	[Doc] improve relative links and structure (#1924)	2024-11-05 01:12:10 -08:00
Lianmin Zheng	1853c3523b	Fix regex docs (#1909)	2024-11-03 14:18:16 -08:00
Lianmin Zheng	838dcda162	Simplify tokenizer manager (#1899)	2024-11-03 03:52:38 -08:00
Lianmin Zheng	be7986e005	Fix docs (#1890)	2024-11-02 13:26:32 -07:00
Lianmin Zheng	7b394e5f2b	Fix docs (#1889)	2024-11-02 11:46:00 -07:00
Lianmin Zheng	2134f0898c	Fix links in the docs (#1878)	2024-11-01 18:25:55 -07:00
Lianmin Zheng	a54f278d44	Add a FAQ documentation (#1877)	2024-11-01 18:16:29 -07:00

52 Commits