Yineng Zhang
|
cf0f7eafe6
|
chore: bump v0.4.2.post1 (#3233)
|
2025-01-31 20:35:55 +08:00 |
|
Ravi Theja
|
9829e77e3f
|
Docs: Update supported models with Mistral 3 (#3229)
Co-authored-by: Ravi Theja Desetty <ravitheja@Ravis-MacBook-Pro.local>
|
2025-01-31 00:01:46 -08:00 |
|
Mick
|
9f635ea50d
|
[Fix] Address remaining issues of supporting MiniCPMV (#2977)
|
2025-01-28 00:22:13 -08:00 |
|
Jhin
|
7b9b4f4426
|
Docs fix about EAGLE and streaming output (#3166)
Co-authored-by: Chayenne <zhaochenyang@ucla.edu>
Co-authored-by: Chayenne <zhaochen20@outlook.com>
Co-authored-by: Jhin <jhinpan@umich.edu>
|
2025-01-27 18:10:45 -08:00 |
|
Yineng Zhang
|
4ab43cfb3e
|
chore: bump v0.4.2 (#3180)
|
2025-01-27 21:42:05 +08:00 |
|
Jhin
|
9472e69963
|
Doc: Add Docs about EAGLE speculative decoding (#3144)
Co-authored-by: Chayenne <zhaochenyang@ucla.edu>
Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>
|
2025-01-26 17:49:13 -08:00 |
|
Chayenne
|
1acc1f561a
|
[Docs]: Add function calling in index.rst (#3155)
|
2025-01-26 11:11:27 -08:00 |
|
YAMY
|
b045841bae
|
Feature/function calling update (#2700)
Co-authored-by: Mingyuan Ma <mamingyuan2001@berkeley.edu>
Co-authored-by: Chayenne <zhaochen20@outlook.com>
Co-authored-by: shuaills <shishuaiuoe@gmail.com>
|
2025-01-26 09:57:51 -08:00 |
|
Adarsh Shirawalmath
|
4505a43614
|
[Docs] minor update for phi-3 and phi-4 (#3096)
|
2025-01-24 04:00:20 -08:00 |
|
simveit
|
1c4e0d2445
|
Docs: Update doc for server arguments (#2742)
Co-authored-by: Chayenne <zhaochen20@outlook.com>
Co-authored-by: Yineng Zhang <me@zhyncs.com>
|
2025-01-23 11:32:05 -08:00 |
|
Baizhou Zhang
|
b3393e941f
|
[Doc] Update doc of profiling with PyTorch Profiler (#3038)
|
2025-01-22 14:17:26 -08:00 |
|
Hongpeng Guo
|
949b3fbfce
|
[Doc] Update doc of custom logit processor (#3021)
Signed-off-by: Hongpeng Guo <hpguo@anyscale.com>
|
2025-01-20 16:50:25 -08:00 |
|
Yineng Zhang
|
e94fb7cb10
|
chore: bump v0.4.1.post7 (#3009)
|
2025-01-20 21:50:55 +08:00 |
|
Chayenne
|
2584f6d944
|
Docs: Add Performance Demonstration for DPA (#3005)
|
2025-01-20 01:00:52 -08:00 |
|
Lianmin Zheng
|
03464890e0
|
Separate two entry points: Engine and HTTP server (#2996)
Co-authored-by: fzyzcjy <5236035+fzyzcjy@users.noreply.github.com>
|
2025-01-19 22:09:24 -08:00 |
|
Chayenne
|
0ffcfdf474
|
Docs: Only use X-Grammar in structured output (#2991)
|
2025-01-19 20:22:47 -08:00 |
|
Enrique Shockwave
|
3bcf5ecea7
|
support regex in xgrammar backend (#2983)
|
2025-01-20 04:34:41 +08:00 |
|
Yineng Zhang
|
def5c31873
|
docs: update supported_models (#2987)
|
2025-01-20 00:44:30 +08:00 |
|
Mick
|
3d93f84a00
|
[Feature] Support minicpmv v2.6 (#2785)
Co-authored-by: Chayenne <zhaochen20@outlook.com>
Co-authored-by: yizhang2077 <1109276519@qq.com>
|
2025-01-18 14:14:19 -08:00 |
|
Wen Sun
|
120c3634ef
|
Fix Llama-3.1-405B References Docs (#2944)
|
2025-01-17 14:46:38 -08:00 |
|
Lianmin Zheng
|
8b6ce52e92
|
Support multi-node DP attention (#2925)
Co-authored-by: dhou-xai <dhou@x.ai>
|
2025-01-16 11:15:00 -08:00 |
|
Yineng Zhang
|
b3e99dfb22
|
chore: bump v0.4.1.post6 (#2899)
|
2025-01-15 16:23:42 +08:00 |
|
Yineng Zhang
|
41d7e5b7e6
|
docs: update link (#2857)
|
2025-01-13 18:40:48 +08:00 |
|
Lianmin Zheng
|
72c7776355
|
Fix linear.py and improve weight loading (#2851)
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
|
2025-01-13 01:39:14 -08:00 |
|
Shi Shuai
|
c4f9707e16
|
Improve: Token-In Token-Out Usage for RLHF (#2843)
|
2025-01-11 15:14:26 -08:00 |
|
Yineng Zhang
|
f624901cdd
|
chore: bump v0.4.1.post5 (#2840)
|
2025-01-11 23:10:02 +08:00 |
|
Chayenne
|
5cc1170552
|
Doc: add block-wise FP8 in dpsk model reference (#2830)
|
2025-01-10 00:26:59 -08:00 |
|
Xiaotong Jiang
|
11fffbc95a
|
[Doc]: Deepseek reference docs (#2787)
|
2025-01-09 13:43:12 -08:00 |
|
Chayenne
|
2e6346fc2e
|
Docs: Update the style of Llama 3.1 405B docs (#2789)
|
2025-01-08 01:07:54 -08:00 |
|
mlmz
|
977f785dad
|
Docs: Rewrite docs for LLama 405B and ModelSpace (#2773)
Co-authored-by: Chayenne <zhaochen20@outlook.com>
|
2025-01-08 00:02:59 -08:00 |
|
Yineng Zhang
|
2f0d386496
|
chore: bump v0.4.1.post4 (#2713)
|
2025-01-06 01:29:54 +08:00 |
|
Lianmin Zheng
|
0f9cc6d8d3
|
Fix package loss for small models (#2717)
Co-authored-by: sdli1995 < mmlmonkey@163.com>
|
2025-01-02 18:25:26 -08:00 |
|
Shi Shuai
|
dd2e2d275f
|
Docs: Update documentation workflow and contribution guide (#2704)
Co-authored-by: Chayenne <zhaochen20@outlook.com>
|
2025-01-02 09:18:31 -08:00 |
|
Shi Shuai
|
062c48d2bd
|
[Docs] Add Support for Pydantic Structured Output Format (#2697)
|
2025-01-01 15:08:43 -08:00 |
|
Chayenne
|
0d8d97b8e6
|
Doc: Rename contribution_guide.md (#2691)
|
2024-12-31 14:35:48 -08:00 |
|
Shi Shuai
|
0a765bbccc
|
Docs: Refactor Contribution Guide (#2690)
|
2024-12-31 14:11:00 -08:00 |
|
Yineng Zhang
|
d49b13c6f8
|
feat: use CUDA 12.4 by default (for FA3) (#2682)
|
2024-12-31 15:52:09 +08:00 |
|
Lianmin Zheng
|
bdd2827a80
|
Update structured_outputs.ipynb (#2666)
|
2024-12-30 00:46:41 -08:00 |
|
Lianmin Zheng
|
8c3b420eec
|
[Docs] clean up structured outputs docs (#2654)
|
2024-12-29 23:57:16 -08:00 |
|
Yineng Zhang
|
098d659c0e
|
docs: update README (#2651)
|
2024-12-30 13:33:29 +08:00 |
|
Lianmin Zheng
|
03d5fbfd44
|
Release 0.4.1.post3 - upload the config.json to PyPI (#2647)
|
2024-12-29 14:25:53 -08:00 |
|
Yineng Zhang
|
b085e06b01
|
docs: add development guide using docker (#2645)
|
2024-12-30 02:22:54 +08:00 |
|
Yineng Zhang
|
3ccf566b0d
|
chore: bump v0.4.1.post2 (#2643)
|
2024-12-30 00:11:46 +08:00 |
|
Adarsh Shirawalmath
|
fd34f2da35
|
[Docs] Add EBNF to sampling params docs (#2609)
|
2024-12-29 00:05:00 -08:00 |
|
Tanjiro
|
8ee9a8501a
|
[Feature] Function Calling (#2544)
Co-authored-by: Haoyu Wang <120358163+HaoyuWang4188@users.noreply.github.com>
|
2024-12-28 21:58:52 -08:00 |
|
Shi Shuai
|
333e3bfde5
|
[docs]Refactor constrained decoding tutorial (#2633)
|
2024-12-28 07:00:38 -08:00 |
|
Shi Shuai
|
239c9d4d3a
|
Docs: Add constrained decoding tutorial (#2614)
Co-authored-by: Chayenne <zhaochen20@outlook.com>
|
2024-12-27 23:54:28 -08:00 |
|
Lianmin Zheng
|
751e5ca273
|
[minor] clean up docs and eos id (#2622)
|
2024-12-27 11:23:46 -08:00 |
|
Yineng Zhang
|
ef5b0ff90b
|
chore: bump v0.4.1.post1 (#2616)
|
2024-12-28 00:11:06 +08:00 |
|
Lianmin Zheng
|
2125898af5
|
Update contributor_guide.md (#2603)
|
2024-12-26 08:36:13 -08:00 |
|