Yineng Zhang
|
ba80c102f9
|
bump v0.4.4.post1 (#4402)
|
2025-03-13 17:53:46 -07:00 |
|
Yineng Zhang
|
6aaeb84872
|
chore: bump v0.4.4 (#4041)
|
2025-03-13 02:49:58 -07:00 |
|
Yineng Zhang
|
3623b6a7f5
|
upgrade sgl-kernel 0.0.5 (#4381)
|
2025-03-13 02:37:56 -07:00 |
|
Lianmin Zheng
|
45de89719c
|
Revert "[XPU][CPU] Enable the native path of DeepSeek" (#4367)
|
2025-03-12 23:45:52 -07:00 |
|
Meng, Hengyu
|
71046fcd71
|
[XPU][CPU] Enable the native path of DeepSeek (#4086)
Co-authored-by: Zhang, Liangang <liangang.zhang@intel.com>
|
2025-03-12 22:26:29 -07:00 |
|
YR Chen
|
ccdd10c84b
|
Move aiohttp into public dependencies (#3980)
|
2025-03-12 21:42:57 -07:00 |
|
Yineng Zhang
|
ed91561f79
|
upgrade sgl-kernel 0.0.4.post3 (#4334)
|
2025-03-12 01:36:41 -07:00 |
|
Yineng Zhang
|
1cf63485c1
|
upgrade flashinfer 0.2.3 (#4317)
Co-authored-by: qingquansong <qsong@linkedin.com>
|
2025-03-11 15:37:17 -07:00 |
|
yigex
|
690e1f2371
|
[AMD] Fix rocm sgl-kernel missing modules error (#4311)
Co-authored-by: yiakwy-xpu-ml-framework-team <leiwang2@amd.com>
|
2025-03-11 10:35:28 -07:00 |
|
Yineng Zhang
|
4d27eb9ad1
|
update sgl-kernel 0.0.4.post2 (#4291)
|
2025-03-11 00:34:33 -07:00 |
|
Yineng Zhang
|
e187a3d595
|
upgrade xgrammar 0.1.15 (#4275)
|
2025-03-10 14:53:24 -07:00 |
|
Lianmin Zheng
|
5a6400eec5
|
Test no vllm custom allreduce (#4256)
|
2025-03-10 10:08:25 -07:00 |
|
Yineng Zhang
|
89ccb533ad
|
use sgl-kernel 0.0.4 (#4224)
|
2025-03-08 23:43:09 -08:00 |
|
Lianmin Zheng
|
d4017a6b63
|
[EAGLE] many fixes for eagle (#4195)
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
Co-authored-by: Sehoon Kim <sehoon@x.ai>
|
2025-03-07 22:12:13 -08:00 |
|
Lianmin Zheng
|
9c58e68b4c
|
Release v0.4.3.post4 (#4140)
|
2025-03-06 12:50:28 -08:00 |
|
Oliver Stanley
|
d03b3467b8
|
Fix constrained generation errors by adding datasets dependency (#4142)
|
2025-03-06 12:07:51 -08:00 |
|
Yineng Zhang
|
fc671f66c1
|
chore: bump v0.4.3.post3 (#4114)
|
2025-03-05 17:26:10 -08:00 |
|
Lianmin Zheng
|
e074d84e5b
|
[Minor] more code cleanup (#4077)
|
2025-03-04 21:23:47 -08:00 |
|
HAI
|
51d25405a7
|
ROCm: update aiter and its usage to fused moe (bfloat16, fp8, fp8 block-quant) (#4053)
|
2025-03-04 03:00:46 -08:00 |
|
Lianmin Zheng
|
935cda944b
|
Misc clean up; Remove the support of jump forward (#4032)
|
2025-03-03 07:02:14 -08:00 |
|
Lianmin Zheng
|
ac2387279e
|
Support penalty in overlap mode; return logprob with chunked prefill; improve benchmark scripts (#3988)
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
Co-authored-by: dhou-xai <dhou@x.ai>
Co-authored-by: Hanming Lu <hanming_lu@berkeley.edu>
|
2025-03-03 00:12:04 -08:00 |
|
Yineng Zhang
|
f3b99f73b3
|
update flashinfer-python version
|
2025-02-28 16:31:59 -08:00 |
|
Chayenne
|
90bc26a813
|
set a strict sgl-kernel version (#3950)
|
2025-02-27 22:44:57 -08:00 |
|
Yineng Zhang
|
564bdf29f7
|
upgrade flashinfer v0.2.2.post1 (#3934)
|
2025-02-27 09:53:48 -08:00 |
|
Enrique Shockwave
|
d281587989
|
Improve: Support xgrammar 0.1.14 (#3593)
|
2025-02-27 08:42:54 -08:00 |
|
JC1DA
|
7551498a69
|
[Feature] Support llguidance for constrained decoding (#3298)
|
2025-02-26 10:41:49 -08:00 |
|
Lianmin Zheng
|
27a46317b6
|
Fix dependency (#3813)
|
2025-02-24 03:50:58 -08:00 |
|
Yineng Zhang
|
058d199d4e
|
use transformers 4.48.3 (#3650)
|
2025-02-18 04:40:47 +08:00 |
|
Yineng Zhang
|
a5375adc3a
|
chore: bump v0.4.3.post2 (#3645)
Co-authored-by: pankajroark <pankajroark@users.noreply.github.com>
|
2025-02-18 02:48:30 +08:00 |
|
Yineng Zhang
|
75d171a9c5
|
chore: update flashinfer v0.2.1.post2 (#3644)
|
2025-02-18 02:47:42 +08:00 |
|
Yineng Zhang
|
e782eb7e6a
|
chore: bump v0.4.3.post1 (#3638)
|
2025-02-17 21:58:19 +08:00 |
|
Yineng Zhang
|
bbc47c348f
|
fix apply_token_bitmask_inplace_cuda (#3594)
|
2025-02-15 23:55:08 +08:00 |
|
Yineng Zhang
|
e0b9a423c8
|
chore: bump v0.4.3 (#3556)
|
2025-02-14 09:43:14 +08:00 |
|
Yineng Zhang
|
70f894b810
|
feat: support flashinfer mla attention for deepseek v3 (#3550)
|
2025-02-14 08:50:14 +08:00 |
|
yizhang2077
|
98eecbda54
|
integrate blockwise fp8 kernel (#3529)
|
2025-02-13 04:39:33 +08:00 |
|
Xiaoyu Zhang
|
45e3a7bc41
|
use sgl_per_token_group_quant_fp8 kernel (#3493)
|
2025-02-12 18:40:42 +08:00 |
|
Yineng Zhang
|
cddb1cdf8f
|
chore: bump v0.4.2.post4 (#3459)
|
2025-02-10 14:12:16 +08:00 |
|
Yineng Zhang
|
85986bb978
|
compatible with new outlines (#3435)
|
2025-02-10 01:51:30 +08:00 |
|
Yineng Zhang
|
c1f5f99f60
|
chore: bump v0.4.2.post3 (#3369)
|
2025-02-07 08:20:03 -08:00 |
|
Yineng Zhang
|
f287037673
|
update sgl-kernel version (#3374)
|
2025-02-07 20:51:06 +08:00 |
|
Yineng Zhang
|
7aad8d1854
|
chore: bump v0.4.2.post2 (#3313)
|
2025-02-05 17:35:02 +08:00 |
|
HAI
|
2c1a695ff1
|
ROCm: sgl-kernel enablement starting with sgl_moe_align_block (#3287)
|
2025-02-04 21:44:44 +08:00 |
|
Yineng Zhang
|
d39899e85c
|
upgrade flashinfer v0.2.0.post2 (#3288)
Co-authored-by: pankajroark <pankajroark@users.noreply.github.com>
|
2025-02-04 21:41:40 +08:00 |
|
HAI
|
566d61d90f
|
ROCm: bump 6.3.0 (#3259)
|
2025-02-03 04:13:40 +08:00 |
|
Yineng Zhang
|
34e405e01f
|
update sgl-kernel version for sglang (#3238)
|
2025-02-01 02:14:41 +08:00 |
|
Yineng Zhang
|
cf0f7eafe6
|
chore: bump v0.4.2.post1 (#3233)
|
2025-01-31 20:35:55 +08:00 |
|
Yineng Zhang
|
4ab43cfb3e
|
chore: bump v0.4.2 (#3180)
|
2025-01-27 21:42:05 +08:00 |
|
Yineng Zhang
|
2f79f58873
|
feat: use sgl-kernel 0.0.3 in sglang (#3179)
|
2025-01-27 21:39:52 +08:00 |
|
Yineng Zhang
|
7e0976133c
|
update sgl-kernel version for srt (#3150)
|
2025-01-26 20:22:34 +08:00 |
|
Yineng Zhang
|
e94fb7cb10
|
chore: bump v0.4.1.post7 (#3009)
|
2025-01-20 21:50:55 +08:00 |
|