Yineng Zhang | 230106304d | chore: upgrade sgl-kernel v0.1.2.post1 (#6196); Co-authored-by: alcanderian <alcanderian@gmail.com> | 2025-05-11 22:41:37 +08:00 |
applesaucethebun | 2ce8793519 | Add typo checker in pre-commit (#6179); Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca> | 2025-05-11 12:55:00 +08:00 |
Yineng Zhang | 678d8cc987 | chore: bump v0.4.6.post3 (#6165) | 2025-05-09 15:38:47 -07:00 |
Yixin Dong | 911f3ba6f4 | upgrade xgrammar to 0.1.19 (#6129) | 2025-05-08 14:42:02 -07:00 |
JieXin Liang | f1ff736d68 | [fix] fix pyproject.toml dependencies (#6119) | 2025-05-08 02:14:36 -07:00 |
Song Zhang | 00c2c1f08b | [Feature] Support for Ascend NPU backend (#3853); Signed-off-by: Song Zhang <gepin.zs@antgroup.com>; Co-authored-by: 22dimensions <waitingwind@foxmail.com> | 2025-05-06 20:32:53 -07:00 |
Yineng Zhang | 9858113c33 | chore: bump v0.4.6.post2 (#5939) | 2025-04-30 22:04:40 -07:00 |
Yineng Zhang | 9a6ad8916d | chore: upgrade sgl-kernel 0.1.1 (#5933) | 2025-04-30 16:13:30 -07:00 |
liwenju0 | 8fefdd32c7 | [Feature] add support kimi vl model (#5383); Co-authored-by: wenju.li <wenju.li@deepctr.cn> | 2025-04-29 21:31:19 -07:00 |
Baizhou Zhang | 799789afed | Bump Flashinfer to 0.2.5 (#5870); Co-authored-by: Yuhao Chen <yxckeis8@gmail.com> | 2025-04-29 19:50:57 -07:00 |
Yineng Zhang | dcae1fb2cd | chore: bump v0.4.6.post1 (#5845) | 2025-04-28 12:57:08 -07:00 |
Yineng Zhang | 41ac0c6d48 | chore: upgrade sgl-kernel 0.1.0 (#5690) | 2025-04-27 21:00:50 -07:00 |
Baizhou Zhang | 84022c0e56 | Release v0.4.6 (#5795) | 2025-04-27 14:07:05 -07:00 |
Michał Moskal | bdbe5f816b | update llguidance to 0.7.11; adds StructTag (#4870) | 2025-04-26 20:13:57 -07:00 |
Connector Switch | 70d040f904 | [NFC] Remove duplicate compressed-tensors (#5640) | 2025-04-22 09:10:25 -07:00 |
Yineng Zhang | b9c87e781d | chore: bump v0.4.5.post3 (#5611) | 2025-04-21 18:16:20 -07:00 |
lukec | 417b44eba8 | [Feat] upgrade pytorch2.6 (#5417) | 2025-04-20 16:06:34 -07:00 |
Lianmin Zheng | fbdc94ba59 | Release v0.4.5.post2 (#5582) | 2025-04-20 14:12:37 -07:00 |
Yineng Zhang | 2c11f9c2eb | chore: upgrade sgl-kernel 0.0.9.post2 (#5540) | 2025-04-18 21:17:23 -07:00 |
Yineng Zhang | 5b5c7237c8 | chore: bump v0.4.5.post1 (#5445) | 2025-04-15 23:00:07 -07:00 |
Yineng Zhang | 8ec0bb7d55 | chore: upgrade sgl-kernel 0.0.9.post1 (#5436) | 2025-04-15 15:45:51 -07:00 |
Yineng Zhang | 8aab7fdb21 | chore: upgrade sgl-kernel 0.0.9 (#5401) | 2025-04-14 22:37:59 -07:00 |
Yineng Zhang | f58b929a51 | chore: upgrade sgl-kernel 0.0.8.post3 (#5342) | 2025-04-13 00:45:59 -07:00 |
Yineng Zhang | f774a0d275 | feat: add blackwell Dockerfile (#5302) | 2025-04-11 13:08:53 -07:00 |
Ke Bao | 1078396f47 | Update deps for mllama4 (#5215) | 2025-04-10 09:12:44 -07:00 |
Yineng Zhang | 57f99608f4 | bump v0.4.5 (#5117) | 2025-04-07 00:35:00 -07:00 |
Yineng Zhang | 35e0856b90 | bump v0.4.4.post4 (#5091) | 2025-04-05 15:36:17 -07:00 |
Yi Zhang | aba5ca154d | python transfer custom allreduce from trt kernel to vllm kernel (#5080) | 2025-04-05 15:35:55 -07:00 |
Yineng Zhang | 0d99adb715 | upgrade transformers 4.51.0 (#5088) | 2025-04-05 14:20:23 -07:00 |
Yineng Zhang | e53bf190bc | upgrade sgl-kernel v0.0.7 (#5049) | 2025-04-03 17:07:54 -07:00 |
Yineng Zhang | 1c63e79756 | use fa3 in sgl-kernel (#4954) | 2025-03-31 16:14:49 -07:00 |
Lianmin Zheng | b26bc86b36 | Support page size > 1 + eagle (#4908) | 2025-03-30 00:46:23 -07:00 |
Yineng Zhang | 19e96e5923 | bump v0.4.4.post3 (#4878) | 2025-03-28 23:21:24 -07:00 |
Yineng Zhang | d8a136a113 | upgrade sgl-kernel 0.0.5.post4 (#4873) | 2025-03-28 19:48:56 -07:00 |
Lianmin Zheng | 74e0ac1dbd | Clean up import vllm in quantization/__init__.py (#4834) | 2025-03-28 10:34:10 -07:00 |
fzyzcjy | d3f71f5e19 | Fix torch.cuda.MemPool() internal assertion failure (#4687); Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com> | 2025-03-27 22:29:36 -07:00 |
Junrong Lin | bb0fd749a6 | [Fix] Add compressed_tensors as deps (#4819) | 2025-03-27 18:08:24 -07:00 |
Yineng Zhang | bbab97a6a8 | add partial_json_parser and einops (#4827) | 2025-03-27 13:24:54 -07:00 |
Yineng Zhang | 6f5cc5eb05 | update xgrammar 0.1.17 (#4804) | 2025-03-27 00:21:59 -07:00 |
Yineng Zhang | 1099f6c974 | bump v0.4.4.post2 (#4669) | 2025-03-26 19:58:00 -07:00 |
Xiaoyu Zhang | 04e3ff6975 | Support compressed tensors fp8w8a8 (#4743) | 2025-03-26 13:21:25 -07:00 |
fzyzcjy | 26f07294f1 | Warn users when release_memory_occupation is called without memory saver enabled (#4566) | 2025-03-26 00:18:14 -07:00 |
Mick | 1e86457c90 | model: Minicpmo (#3023) | 2025-03-24 20:08:40 -07:00 |
Yineng Zhang | c11cfda07b | update pyproject (#4731) | 2025-03-24 09:50:28 -07:00 |
Yuhong Guo | 64edeb798f | Support dynamic version name in sglang's pyproject.toml (#4720) | 2025-03-24 08:56:31 -07:00 |
Adarsh Shirawalmath | f8f9244a61 | [Bug Fix] Add partial rotary factor support for Phi-4 and upgrade to transformers v4.50.0 (#3984); Co-authored-by: Chayenne <zhaochen20@outlook.com> | 2025-03-22 14:27:39 -07:00 |
Yineng Zhang | f81a27f65e | upgrade sgl-kernel 0.0.5.post3 (#4522) | 2025-03-17 14:49:56 -07:00 |
mlmz | 452db50808 | Constraint Decoding: Set xgrammar as the default grammar backend (#4386) | 2025-03-16 18:53:43 -07:00 |
Ying Sheng | 1b859295f4 | [Eagle] Remove the greedy branch and some redundant code (#4363); Co-authored-by: Sehoon Kim <sehoon@x.ai> | 2025-03-16 02:48:55 -07:00 |
Yineng Zhang | ad1ae7f7cd | use topk_softmax with sgl-kernel (#4439) | 2025-03-14 15:59:06 -07:00 |