Yineng Zhang
|
94fb4e9e54
|
feat: support fa cute in sgl-kernel (#10205)
Co-authored-by: cicirori <32845984+cicirori@users.noreply.github.com>
|
2025-09-09 00:14:39 -07:00 |
|
Swipe4057
|
bfd7a18d8d
|
update xgrammar 0.1.24 and transformers 4.56.1 (#10155)
|
2025-09-08 01:20:31 -07:00 |
|
jacky.cheng
|
efb0de2c8d
|
Update wave-lang to 3.7.0 and unify Wave kernel buffer options (#10069)
|
2025-09-05 16:01:52 -07:00 |
|
hlu1
|
2985090084
|
Update flashinfer to 0.3.1 for B300 support (#10087)
Signed-off-by: Hao Lu <14827759+hlu1@users.noreply.github.com>
|
2025-09-05 13:41:01 -07:00 |
|
Yineng Zhang
|
fa9c82d339
|
chore: bump v0.5.2rc2 (#10050)
|
2025-09-04 20:07:27 -07:00 |
|
Yineng Zhang
|
18f91eb639
|
chore: bump v0.5.2rc1 (#9920)
|
2025-09-02 04:43:34 -07:00 |
|
JieXin Liang
|
1db649ac02
|
[feat] apply deep_gemm compile_mode to skip launch (#9879)
|
2025-09-02 03:20:30 -07:00 |
|
Yineng Zhang
|
16e56ea693
|
chore: bump v0.5.2rc0 (#9862)
|
2025-09-01 03:07:36 -07:00 |
|
Yineng Zhang
|
349b491c63
|
chore: upgrade flashinfer 0.3.0 (#9864)
|
2025-09-01 03:07:19 -07:00 |
|
Yineng Zhang
|
300676afac
|
chore: upgrade transformers 4.56.0 (#9827)
|
2025-08-30 14:07:34 -07:00 |
|
Yineng Zhang
|
9970e3bf32
|
chore: upgrade sgl-kernel 0.3.7.post1 with deepgemm fix (#9822)
|
2025-08-30 04:02:25 -07:00 |
|
Yineng Zhang
|
3d8fc43400
|
chore: upgrade flashinfer 0.3.0rc1 (#9793)
|
2025-08-29 16:24:17 -07:00 |
|
Yineng Zhang
|
bc80dc4ce0
|
chore: bump v0.5.1.post3 (#9716)
|
2025-08-27 15:42:42 -07:00 |
|
Yineng Zhang
|
b962a296ed
|
chore: upgrade sgl-kernel 0.3.7 (#9708)
|
2025-08-27 14:00:31 -07:00 |
|
Liangsheng Yin
|
0ff7241995
|
Improve bench_one_batch_server script (#9608)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-08-26 10:38:37 +08:00 |
|
Yineng Zhang
|
e3e97a120b
|
chore: bump v0.5.1.post2 (#9592)
|
2025-08-25 03:45:09 -07:00 |
|
Yineng Zhang
|
938e986e15
|
chore: upgrade flashinfer 0.2.14.post1 (#9578)
|
2025-08-25 00:12:17 -07:00 |
|
Yineng Zhang
|
e0ab167db0
|
chore: bump v0.5.1.post1 (#9558)
|
2025-08-24 01:14:17 -07:00 |
|
Lianmin Zheng
|
97a38ee85b
|
Release 0.5.1 (#9533)
|
2025-08-23 07:09:26 -07:00 |
|
Lianmin Zheng
|
f20b6a3f2b
|
[minor] Sync style changes (#9376)
|
2025-08-19 21:35:01 -07:00 |
|
Swipe4057
|
6805f6da40
|
upgrade xgrammar 0.1.23 and openai-harmony 0.0.4 (#9284)
|
2025-08-18 14:02:00 -07:00 |
|
Lianmin Zheng
|
c480a3f6ea
|
Minor style fixes for sgl-kernel (#9289)
|
2025-08-18 09:38:35 -07:00 |
|
Yineng Zhang
|
fab0f6e77d
|
chore: bump v0.5.0rc2 (#9203)
|
2025-08-14 16:11:16 -07:00 |
|
Yineng Zhang
|
ac474869d4
|
chore: upgrade transformers 4.55.2 (#9197)
|
2025-08-14 13:51:02 -07:00 |
|
Hongbo Xu
|
2cc9eeab01
|
[4/n]decouple quantization implementation from vLLM dependency (#9191)
Co-authored-by: AniZpZ <aniz1905@gmail.com>
Co-authored-by: Yineng Zhang <me@zhyncs.com>
|
2025-08-14 12:05:46 -07:00 |
|
eigen
|
4dbf43601d
|
fix: zero_init buffer (#9065)
Co-authored-by: Yineng Zhang <me@zhyncs.com>
|
2025-08-14 02:39:09 -07:00 |
|
Yineng Zhang
|
7b56e494be
|
chore: bump v0.5.0rc1 (#9069)
|
2025-08-13 10:44:14 -07:00 |
|
Ke Bao
|
0ff6d1fce1
|
Support FA3 backend for gpt-oss (#9028)
|
2025-08-13 10:41:50 -07:00 |
|
jacky.cheng
|
25caa7a8a9
|
[AMD] Support Wave attention backend with AMD GPU optimizations (#8660)
Signed-off-by: Stanley Winata <stanley.winata@amd.com>
Signed-off-by: Harsh Menon <harsh@nod-labs.com>
Signed-off-by: nithinsubbiah <nithinsubbiah@gmail.com>
Signed-off-by: Ivan Butygin <ivan.butygin@gmail.com>
Signed-off-by: xintin <gaurav.verma@amd.com>
Co-authored-by: Harsh Menon <harsh@nod-labs.com>
Co-authored-by: Stanley Winata <stanley.winata@amd.com>
Co-authored-by: Stanley Winata <68087699+raikonenfnu@users.noreply.github.com>
Co-authored-by: Stanley Winata <stanley@nod-labs.com>
Co-authored-by: Ivan Butygin <ivan.butygin@gmail.com>
Co-authored-by: nithinsubbiah <nithinsubbiah@gmail.com>
Co-authored-by: Nithin Meganathan <18070964+nithinsubbiah@users.noreply.github.com>
Co-authored-by: Ivan Butygin <ibutygin@amd.com>
|
2025-08-12 13:49:11 -07:00 |
|
Jiaqi Gu
|
c9ee738515
|
Fuse writing KV buffer into rope kernel (part 2: srt) (#9014)
Co-authored-by: fzyzcjy <5236035+fzyzcjy@users.noreply.github.com>
|
2025-08-12 13:15:30 -07:00 |
|
zhyncs
|
f4ae50e97c
|
fix: use flashinfer v0.2.11.post1
|
2025-08-11 02:49:25 -07:00 |
|
Yineng Zhang
|
84cb449eec
|
Revert "chore: upgrade flashinfer 0.2.11 (#9036)" (#9057)
|
2025-08-11 00:16:39 -07:00 |
|
Yineng Zhang
|
dd001a5477
|
chore: upgrade flashinfer 0.2.11 (#9036)
|
2025-08-10 17:35:37 -07:00 |
|
Lianmin Zheng
|
b58ae7a2a0
|
Simplify frontend language (#9029)
|
2025-08-10 10:59:30 -07:00 |
|
Yineng Zhang
|
326a901df4
|
chore: upgrade sgl-kernel 0.3.3 (#8998)
|
2025-08-09 01:22:01 -07:00 |
|
ishandhanani
|
de8b8b6e5c
|
chore(deps): update minimum python to 3.10 (#8984)
|
2025-08-09 00:30:23 -07:00 |
|
Lianmin Zheng
|
706bd69cc5
|
Clean up server_args.py to have a dedicated function for model specific adjustments (#8983)
|
2025-08-08 19:56:50 -07:00 |
|
Lianmin Zheng
|
67a7d1f699
|
Create cancel-all-pr-test-runs (#8986)
|
2025-08-08 15:53:51 -07:00 |
|
Yineng Zhang
|
9020f7fc32
|
chore: bump v0.5.0rc0 (#8959)
|
2025-08-08 09:16:18 -07:00 |
|
Yineng Zhang
|
4bf6e5a6b0
|
fix: use openai 1.99.1 (#8927)
|
2025-08-07 14:20:35 -07:00 |
|
Chang Su
|
92cc32d9fc
|
Support v1/responses and use harmony in serving_chat (#8837)
Signed-off-by: Xinyuan Tong <justinning0323@outlook.com>
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Co-authored-by: Xinyuan Tong <justinning0323@outlook.com>
Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
|
2025-08-06 16:20:34 -07:00 |
|
Yineng Zhang
|
3ae8e3ea8f
|
chore: upgrade torch 2.8.0 (#8836)
|
2025-08-05 17:32:01 -07:00 |
|
Yineng Zhang
|
4f4e0e4162
|
chore: upgrade flashinfer 0.2.10 (#8827)
|
2025-08-05 12:04:01 -07:00 |
|
Yineng Zhang
|
901ab758ec
|
chore: upgrade transformers 4.55.0 (#8823)
Co-authored-by: hebiao064 <hebiaobuaa@gmail.com>
|
2025-08-05 11:37:21 -07:00 |
|
Yineng Zhang
|
1ea94d3b92
|
chore: upgrade flashinfer v0.2.9 (#8780)
|
2025-08-04 21:59:18 -07:00 |
|
Yineng Zhang
|
8cd344586e
|
chore: bump v0.4.10.post2 (#8727)
|
2025-08-03 03:43:29 -07:00 |
|
fzyzcjy
|
0e612dbf12
|
Tiny fix CI pytest error (#8524)
|
2025-08-02 22:48:42 -07:00 |
|
Swipe4057
|
5deab1283a
|
upgrade xgrammar 0.1.22 (#8522)
|
2025-08-01 15:59:15 -07:00 |
|
pansicheng
|
20b5563eda
|
Add hf3fs_utils.cpp to package-data (#8653)
|
2025-08-01 12:41:09 +08:00 |
|
Ke Bao
|
33f0de337d
|
chore: bump v0.4.10.post1 (#8652)
|
2025-08-01 12:07:30 +08:00 |
|