| Author | Commit | Message | Date |
|---|---|---|---|
| Yineng Zhang | c38b5fb4f4 | update 3rdparty and rms norm for sgl-kernel (#3213) | 2025-01-30 19:32:21 +08:00 |
| Yineng Zhang | 8a96f74988 | chore: bump 0.0.3 for sgl-kernel (#3178); Co-authored-by: ispobock <ispobaoke@hotmail.com>; Co-authored-by: BBuf <35585791+BBuf@users.noreply.github.com>; Co-authored-by: HandH1998 <007aabbcc411@gmail.com>; Co-authored-by: yizhang2077 <1109276519@qq.com>; Co-authored-by: ByronHsu <byronhsu1230@gmail.com> | 2025-01-27 20:29:28 +08:00 |
| Byron Hsu | 514f37c32b | [kernel] Fix position ids in rope (#3173) | 2025-01-27 17:09:51 +08:00 |
| Byron Hsu | 741fccd7bf | Bump sgl kernel to 0.0.2.post19 (#3167) | 2025-01-27 15:36:07 +08:00 |
| Yineng Zhang | 318260c0fa | chore: bump 0.0.2.post18 for sgl-kernel (#3149) | 2025-01-26 19:00:34 +08:00 |
| Yineng Zhang | 896c07441e | update installation doc for sgl-kernel (#3129) | 2025-01-26 00:00:13 +08:00 |
| Yineng Zhang | 14e754a868 | chore: bump v0.0.2.post17 for sgl-kernel (#3125) | 2025-01-25 20:43:02 +08:00 |
| Yineng Zhang | 54bac8af0b | chore: bump sgl-kernel 0.0.2.post16 (#3087) | 2025-01-24 01:57:48 +08:00 |
| Yineng Zhang | 1f6cf0d4b9 | fix build error for sgl-kernel (#3078) | 2025-01-23 19:16:35 +08:00 |
| Lianmin Zheng | 553f5a3ffe | Remove torch dependency in sgl-kernel (#3074) | 2025-01-23 17:23:37 +08:00 |
| Byron Hsu | b5caa22dfb | [kernel] port rope cuda kernel to sgl-kernel (#2993); Co-authored-by: Yineng Zhang <me@zhyncs.com> | 2025-01-20 20:58:51 +08:00 |
| Yineng Zhang | a53454c55e | fix: sgl-kernel link cuda (#2906) | 2025-01-16 04:53:23 +08:00 |
| yizhang2077 | 6cb3974e77 | optimize custom allreduce kernel (#2904) | 2025-01-16 03:04:25 +08:00 |
| Xiaoyu Zhang | e2b16c4716 | add sampling_scaling_penalties kernel (#2846) | 2025-01-12 19:38:17 -08:00 |
| yizhang2077 | 3900a94afe | Support twoshot kernel (#2688) | 2025-01-06 00:47:16 +08:00 |
| HandH1998 | 77d1210b36 | fix moe_align_block_size (#2615) | 2024-12-27 23:32:53 +08:00 |
| Yineng Zhang | 2dccecf432 | fix: only enable moe_align_block_size for now (#2590) | 2024-12-26 16:56:59 +08:00 |
| Yineng Zhang | d7c0e872b0 | chore: bump 0.0.2.post8 for sgl-kernel (#2580) | 2024-12-26 06:11:39 +08:00 |
| Yineng Zhang | 31548116a8 | fix moe_align_block_size_kernel for shared memory issue (#2579); Co-authored-by: ispobock <ispobaoke@163.com> | 2024-12-26 05:31:04 +08:00 |
| yizhang2077 | e04d3f2897 | adapt tensorrt llm custom all reduce to sgl-kernel (#2481); Co-authored-by: Yineng Zhang <me@zhyncs.com> | 2024-12-15 13:15:59 +08:00 |
| Yineng Zhang | 2673fa29d4 | fix: set runtime path (#2466) | 2024-12-12 18:05:48 +08:00 |
| Yineng Zhang | 56fcd8e8a5 | feat: support sgl-kernel PyPI (#2433); Co-authored-by: Zhangyi <1109276519@qq.com> | 2024-12-11 06:06:19 +08:00 |
| Yineng Zhang | 47eb139f81 | feat: use warp reduce as a simple example (#2304) | 2024-12-01 22:43:50 +08:00 |
| Yineng Zhang | 5c91a315d7 | feat: support sgl-kernel pypi (#2302) | 2024-12-01 20:11:21 +08:00 |