Hongpeng Guo
|
949b3fbfce
|
[Doc] Update doc of custom logit processor (#3021)
Signed-off-by: Hongpeng Guo <hpguo@anyscale.com>
|
2025-01-20 16:50:25 -08:00 |
|
Chayenne
|
2584f6d944
|
Docs: Add Performance Demonstration for DPA (#3005)
|
2025-01-20 01:00:52 -08:00 |
|
Lianmin Zheng
|
03464890e0
|
Separate two entry points: Engine and HTTP server (#2996)
Co-authored-by: fzyzcjy <5236035+fzyzcjy@users.noreply.github.com>
|
2025-01-19 22:09:24 -08:00 |
|
Enrique Shockwave
|
3bcf5ecea7
|
support regex in xgrammar backend (#2983)
|
2025-01-20 04:34:41 +08:00 |
|
Yineng Zhang
|
def5c31873
|
docs: update supported_models (#2987)
|
2025-01-20 00:44:30 +08:00 |
|
Mick
|
3d93f84a00
|
[Feature] Support minicpmv v2.6 (#2785)
Co-authored-by: Chayenne <zhaochen20@outlook.com>
Co-authored-by: yizhang2077 <1109276519@qq.com>
|
2025-01-18 14:14:19 -08:00 |
|
Wen Sun
|
120c3634ef
|
Fix Llama-3.1-405B References Docs (#2944)
|
2025-01-17 14:46:38 -08:00 |
|
Lianmin Zheng
|
8b6ce52e92
|
Support multi-node DP attention (#2925)
Co-authored-by: dhou-xai <dhou@x.ai>
|
2025-01-16 11:15:00 -08:00 |
|
Yineng Zhang
|
41d7e5b7e6
|
docs: update link (#2857)
|
2025-01-13 18:40:48 +08:00 |
|
Lianmin Zheng
|
72c7776355
|
Fix linear.py and improve weight loading (#2851)
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
|
2025-01-13 01:39:14 -08:00 |
|
Shi Shuai
|
c4f9707e16
|
Improve: Token-In Token-Out Usage for RLHF (#2843)
|
2025-01-11 15:14:26 -08:00 |
|
Chayenne
|
5cc1170552
|
Doc: add block-wise FP8 in dpsk model reference (#2830)
|
2025-01-10 00:26:59 -08:00 |
|
Xiaotong Jiang
|
11fffbc95a
|
[Doc]: Deepseek reference docs (#2787)
|
2025-01-09 13:43:12 -08:00 |
|
Chayenne
|
2e6346fc2e
|
Docs: Update the style of Llama 3.1 405B docs (#2789)
|
2025-01-08 01:07:54 -08:00 |
|
mlmz
|
977f785dad
|
Docs: Rewrite docs for LLama 405B and ModelSpace (#2773)
Co-authored-by: Chayenne <zhaochen20@outlook.com>
|
2025-01-08 00:02:59 -08:00 |
|
Lianmin Zheng
|
0f9cc6d8d3
|
Fix package loss for small models (#2717)
Co-authored-by: sdli1995 <mmlmonkey@163.com>
|
2025-01-02 18:25:26 -08:00 |
|
Shi Shuai
|
dd2e2d275f
|
Docs: Update documentation workflow and contribution guide (#2704)
Co-authored-by: Chayenne <zhaochen20@outlook.com>
|
2025-01-02 09:18:31 -08:00 |
|
Shi Shuai
|
062c48d2bd
|
[Docs] Add Support for Pydantic Structured Output Format (#2697)
|
2025-01-01 15:08:43 -08:00 |
|
Shi Shuai
|
0a765bbccc
|
Docs: Refactor Contribution Guide (#2690)
|
2024-12-31 14:11:00 -08:00 |
|
Lianmin Zheng
|
8c3b420eec
|
[Docs] clean up structured outputs docs (#2654)
|
2024-12-29 23:57:16 -08:00 |
|
Adarsh Shirawalmath
|
fd34f2da35
|
[Docs] Add EBNF to sampling params docs (#2609)
|
2024-12-29 00:05:00 -08:00 |
|
Lianmin Zheng
|
751e5ca273
|
[minor] clean up docs and eos id (#2622)
|
2024-12-27 11:23:46 -08:00 |
|
Lianmin Zheng
|
2125898af5
|
Update contributor_guide.md (#2603)
|
2024-12-26 08:36:13 -08:00 |
|
Lianmin Zheng
|
dc3bee4815
|
Fix test and benchmark scripts (#2598)
|
2024-12-26 07:56:26 -08:00 |
|
Lianmin Zheng
|
773951548d
|
Fix logprob_start_len for multi modal models (#2597)
Co-authored-by: libra <lihu723@gmail.com>
Co-authored-by: fzyzcjy <ch271828n@outlook.com>
Co-authored-by: Wang, Haoyu <haoyu.wang@intel.com>
|
2024-12-26 06:27:45 -08:00 |
|
Lianmin Zheng
|
8496701934
|
[Misc] Fix metrics, weight update lock, request logging (#2543)
|
2024-12-22 06:27:22 -08:00 |
|
Lianmin Zheng
|
21e9e63ad5
|
Print progress bar during cuda graph capture (#2502)
|
2024-12-17 06:33:46 -08:00 |
|
Fred Reiss
|
993956c6b1
|
Add support for IBM Granite 3.x models (#2437)
|
2024-12-11 06:30:23 -08:00 |
|
SangBin Cho
|
1f09e84b9a
|
nit: Remove busy waiting on scheduler (#2382)
|
2024-12-08 01:06:15 -08:00 |
|
Lianmin Zheng
|
0e7409adb6
|
Fix the overlap for xgrammar (#2377)
|
2024-12-06 05:49:29 -08:00 |
|
vchzls
|
3cde5eb629
|
docs: Improve instructions for supporting new models (#2363)
Co-authored-by: zhaohoulong <zhaohoulong@xiaomi.com>
|
2024-12-06 04:27:17 -08:00 |
|
bjmsong
|
91e5dbf554
|
add profile in offline benchmark & update doc (#2123)
Co-authored-by: root <bjmsong@126.com>
|
2024-11-27 14:57:13 -08:00 |
|
Rin Intachuen
|
1aea19f64b
|
Input_embeds support (#2052)
|
2024-11-25 16:35:04 -08:00 |
|
Lianmin Zheng
|
c211e7b669
|
Simplify batch update (#2154)
|
2024-11-24 04:47:10 -08:00 |
|
Lianmin Zheng
|
dfec7fca06
|
Rename sglang.bench_latency to sglang.bench_one_batch (#2118)
|
2024-11-21 20:07:48 -08:00 |
|
Tanjiro
|
8c280cee55
|
add phi-3 small support (#2062)
Co-authored-by: Tushar Goel <114812108+AI-Tushar@users.noreply.github.com>
|
2024-11-17 18:47:43 -08:00 |
|
Xiaoyu Zhang
|
023d0a73df
|
fix small typos in docs (#2047)
|
2024-11-15 11:09:10 -08:00 |
|
ws
|
29ebe3dff4
|
fix: align enable_overlap_scheduler naming between code and docs (#2038)
|
2024-11-15 03:39:10 -08:00 |
|
RangiLyu
|
f18b9c7252
|
support internlm2-reward (#1994)
|
2024-11-11 15:09:58 -08:00 |
|
aqweteddy
|
f16eb15d0d
|
Gemma2 reward model support (#1954)
|
2024-11-07 22:42:27 -08:00 |
|
Yudi Xue
|
5bc2508b80
|
Monitoring documentation (#1933)
|
2024-11-07 22:14:16 -08:00 |
|
Lianmin Zheng
|
1ae270c5d0
|
[Doc] fix docs (#1949)
|
2024-11-07 18:20:41 -08:00 |
|
Xuehai Pan
|
a5e0defb5a
|
minor: Add basic editorconfig and pre-commit hooks to enforce style for whitespaces (#1926)
|
2024-11-06 13:46:04 +00:00 |
|
Lianmin Zheng
|
f5113e50ae
|
[Doc] improve relative links and structure (#1924)
|
2024-11-05 01:12:10 -08:00 |
|
Lianmin Zheng
|
1853c3523b
|
Fix regex docs (#1909)
|
2024-11-03 14:18:16 -08:00 |
|
Lianmin Zheng
|
838dcda162
|
Simplify tokenizer manager (#1899)
|
2024-11-03 03:52:38 -08:00 |
|
Lianmin Zheng
|
be7986e005
|
Fix docs (#1890)
|
2024-11-02 13:26:32 -07:00 |
|
Lianmin Zheng
|
7b394e5f2b
|
Fix docs (#1889)
|
2024-11-02 11:46:00 -07:00 |
|
Lianmin Zheng
|
2134f0898c
|
Fix links in the docs (#1878)
|
2024-11-01 18:25:55 -07:00 |
|
Lianmin Zheng
|
a54f278d44
|
Add a FAQ documentation (#1877)
|
2024-11-01 18:16:29 -07:00 |
|