Commit Graph

299 Commits

Author SHA1 Message Date
Yineng Zhang
066f4ec91f chore: bump v0.4.9.post1 (#7882) 2025-07-09 00:28:17 -07:00
ybyang
4986104618 Bump xgrammar's version to 0.1.20 (#7866) 2025-07-08 17:55:30 -07:00
Brayden Zhong
a37e1247c1 [Multimodal][Perf] Use pybase64 instead of base64 (#7724) 2025-07-08 14:00:58 -07:00
Yineng Zhang
ec5f9c6269 chore: bump v0.4.9 (#7802) 2025-07-05 17:40:29 -07:00
Yineng Zhang
62f5522ffe chore: upgrade sgl-kernel v0.2.4 (#7801) 2025-07-05 17:37:40 -07:00
Yineng Zhang
77cfea689d chore: upgrade sgl-kernel v0.2.3 (#7786) 2025-07-05 01:55:55 -07:00
Yi Zhang
489934be0a fuse renormal into moe topk softmax kernel python code (#7751) 2025-07-03 16:22:14 -07:00
Co-authored-by: ispobock <ispobaoke@gmail.com>
Co-authored-by: zhyncs <me@zhyncs.com>
Yineng Zhang
f18a8fddd4 chore: upgrade flashinfer v0.2.7.post1 (#7698) 2025-07-01 14:05:57 -07:00
Zhiqiang Xie
f9eb04ddb2 upgrade sgl kernel to 0.2.1 for main (#7676) 2025-07-01 00:00:13 -07:00
Yineng Zhang
392e441ad1 chore: upgrade flashinfer v0.2.7 jit (#7663) 2025-06-30 13:26:26 -07:00
Chunyuan WU
c5131f7a2f [CPU] add c++ kernel to bind CPU cores and memory node (#7524) 2025-06-29 19:45:25 -07:00
Xinyuan Tong
1b95162008 Updates transformers and timm dependencies (#7577) 2025-06-27 00:30:17 -07:00
Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>
Keyang Ru
29bd4c8135 [CI] Add CI Testing for Prefill-Decode Disaggregation with Router (#7540) 2025-06-27 00:18:56 -07:00
Yineng Zhang
69183f8808 chore: bump v0.4.8.post1 (#7559) 2025-06-26 02:21:12 -07:00
Yineng Zhang
7c3a12c000 chore: bump v0.4.8 (#7493) 2025-06-23 23:14:22 -07:00
Lianmin Zheng
55e03b10c4 Fix a bug in BatchTokenIDOut & Misc style and dependency updates (#7457) 2025-06-23 06:20:39 -07:00
Stefan He
3774f07825 Multi-Stage Awake: Support Resume and Pause KV Cache and Weights separately (#7099) 2025-06-19 00:56:37 -07:00
Yineng Zhang
f9dc9dd28b chore: bump v0.4.7.post1 (#7248) 2025-06-16 15:20:29 -07:00
Lianmin Zheng
53a525bf33 [Eagle] Fix kernel call after updating speculative sampling kernels (#7231) 2025-06-16 07:25:59 -07:00
JieXin Liang
ed89837cf4 chore: upgrade sgl-kernel v0.1.8.post2 (#7186) 2025-06-14 18:26:18 -07:00
Co-authored-by: zhyncs <me@zhyncs.com>
fzyzcjy
bec3e48402 Support new DeepGEMM format in per token group quant (part 2: srt) (#7155) 2025-06-13 14:25:40 -07:00
Yineng Zhang
4f723edd3b chore: bump v0.4.7 (#7038) 2025-06-10 01:56:20 -07:00
yudian0504
81372f3bef Fix fused_moe triton configs (#7029) 2025-06-09 23:23:03 -07:00
Wenxuan Tan
a968c888c0 Fix torchvision version for Blackwell (#7015) 2025-06-09 15:50:19 -07:00
Yineng Zhang
56ccd3c22c chore: upgrade flashinfer v0.2.6.post1 jit (#6958) 2025-06-09 09:22:39 -07:00
Co-authored-by: alcanderian <alcanderian@gmail.com>
Co-authored-by: Qiaolin Yu <qy254@cornell.edu>
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
Co-authored-by: Mick <mickjagger19@icloud.com>
Co-authored-by: ispobock <ispobaoke@gmail.com>
Yineng Zhang
23881fa60c chore: upgrade sgl-kernel v0.1.6.post1 (#6957) 2025-06-07 17:18:55 -07:00
JieXin Liang
6153f2ff6e chore: upgrade sgl-kernel v0.1.6 (#6945) 2025-06-07 02:53:26 -07:00
Zaili Wang
562f279a2d [CPU] enable CI for PRs, add Dockerfile and auto build task (#6458) 2025-06-05 13:43:54 -07:00
Co-authored-by: diwei sun <diwei.sun@intel.com>
Co-authored-by: Yineng Zhang <me@zhyncs.com>
Yineng Zhang
34c63731fc chore: upgrade sgl-kernel v0.1.5 (#6795) 2025-05-31 18:32:00 -07:00
Qiaolin Yu
7dc0e39442 Bump torch to 2.7.0 (#6788) 2025-05-31 14:43:12 -07:00
Yineng Zhang
7eb9d8e594 chore: upgrade transformers 4.52.3 (#6575) 2025-05-25 22:49:58 -07:00
Co-authored-by: Mick <mickjagger19@icloud.com>
Lifu Huang
022012aae8 Support Phi-4 Multi-Modal (text + vision only) (#6494) 2025-05-24 21:43:38 -07:00
Yineng Zhang
7e257cd666 chore: bump v0.4.6.post5 (#6566) 2025-05-24 00:48:05 -07:00
Yineng Zhang
0b07c4a99f chore: upgrade sgl-kernel v0.1.4 (#6532) 2025-05-22 13:28:16 -07:00
Trevor Morris
7adf245ba2 [Metrics] Add KV events publishing (#6098) 2025-05-19 14:19:54 -07:00
Yineng Zhang
f07c6a009b chore: upgrade sgl-kernel v0.1.3 (#6377) 2025-05-17 19:47:05 -07:00
Lianmin Zheng
4bb816d444 Fix CI tests (#6362) 2025-05-17 19:16:45 -07:00
Lianmin Zheng
dcc0a45618 Fix amd ci (#6360) 2025-05-16 15:33:10 -07:00
Baizhou Zhang
839fb31e5f [Fix] Improve dependencies for Blackwell image (#6334) 2025-05-16 12:38:22 -07:00
Lianmin Zheng
e07a6977e7 Minor improvements of TokenizerManager / health check (#6327) 2025-05-15 15:29:25 -07:00
Yineng Zhang
16267d4fa7 chore: bump v0.4.6.post4 (#6245) 2025-05-13 01:57:51 -07:00
Stefan He
1ab14c4c5c [VERL Use Case] Add torch_memory_saver into deps (#6247) 2025-05-12 19:09:03 -07:00
Lianmin Zheng
e8e18dcdcc Revert "fix some typos" (#6244) 2025-05-12 12:53:26 -07:00
applesaucethebun
d738ab52f8 fix some typos (#6209) 2025-05-13 01:42:38 +08:00
Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>
Yineng Zhang
230106304d chore: upgrade sgl-kernel v0.1.2.post1 (#6196) 2025-05-11 22:41:37 +08:00
Co-authored-by: alcanderian <alcanderian@gmail.com>
applesaucethebun
2ce8793519 Add typo checker in pre-commit (#6179) 2025-05-11 12:55:00 +08:00
Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>
Yineng Zhang
678d8cc987 chore: bump v0.4.6.post3 (#6165) 2025-05-09 15:38:47 -07:00
Yixin Dong
911f3ba6f4 upgrade xgrammar to 0.1.19 (#6129) 2025-05-08 14:42:02 -07:00
JieXin Liang
f1ff736d68 [fix] fix pyproject.toml dependencies (#6119) 2025-05-08 02:14:36 -07:00
Song Zhang
00c2c1f08b [Feature] Support for Ascend NPU backend (#3853) 2025-05-06 20:32:53 -07:00
Signed-off-by: Song Zhang <gepin.zs@antgroup.com>
Co-authored-by: 22dimensions <waitingwind@foxmail.com>