| Yineng Zhang | f77da69964 | chore: upgrade mooncake-transfer-engine (#6643) | 2025-05-26 20:01:30 -07:00 |
| Lianmin Zheng | e07a6977e7 | Minor improvements of TokenizerManager / health check (#6327) | 2025-05-15 15:29:25 -07:00 |
| Stefan He | 1ab14c4c5c | [VERL Use Case] Add torch_memory_saver into deps (#6247) | 2025-05-12 19:09:03 -07:00 |
| Yineng Zhang | f94543d22b | chore: add hf_xet dep (#6243) | 2025-05-12 13:08:40 -07:00 |
| shangmingc | 0f334945c6 | [CI] Fix PD mooncake dependency error (#6212); Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com> | 2025-05-12 10:08:49 -07:00 |
| Lianmin Zheng | 03227c5fa6 | [CI] Reorganize the 8 gpu tests (#6192) | 2025-05-11 10:55:06 -07:00 |
| Yineng Zhang | 230106304d | chore: upgrade sgl-kernel v0.1.2.post1 (#6196); Co-authored-by: alcanderian <alcanderian@gmail.com> | 2025-05-11 22:41:37 +08:00 |
| applesaucethebun | 2ce8793519 | Add typo checker in pre-commit (#6179); Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca> | 2025-05-11 12:55:00 +08:00 |
| Huapeng Zhou | b8559764f6 | [Test] Add flashmla attention backend test (#5587) | 2025-05-05 10:32:02 -07:00 |
| Yineng Zhang | 9a6ad8916d | chore: upgrade sgl-kernel 0.1.1 (#5933) | 2025-04-30 16:13:30 -07:00 |
| Yineng Zhang | 41ac0c6d48 | chore: upgrade sgl-kernel 0.1.0 (#5690) | 2025-04-27 21:00:50 -07:00 |
| Lianmin Zheng | 3dd3538c18 | Pin torch audio to 2.6.0 (#5750) | 2025-04-25 15:06:28 -07:00 |
| Ravi Theja | 7d9679b74d | Add MMMU benchmark results (#4491); Co-authored-by: Ravi Theja Desetty <ravitheja@Ravis-MacBook-Pro.local> | 2025-04-25 15:23:53 +08:00 |
| lukec | 417b44eba8 | [Feat] upgrade pytorch2.6 (#5417) | 2025-04-20 16:06:34 -07:00 |
| Yineng Zhang | 0961feefca | feat: use flashinfer jit package (#5547) | 2025-04-19 00:28:39 -07:00 |
| Yineng Zhang | 2c11f9c2eb | chore: upgrade sgl-kernel 0.0.9.post2 (#5540) | 2025-04-18 21:17:23 -07:00 |
| Yineng Zhang | 8ec0bb7d55 | chore: upgrade sgl-kernel 0.0.9.post1 (#5436) | 2025-04-15 15:45:51 -07:00 |
| Yineng Zhang | 8aab7fdb21 | chore: upgrade sgl-kernel 0.0.9 (#5401) | 2025-04-14 22:37:59 -07:00 |
| Yineng Zhang | f58b929a51 | chore: upgrade sgl-kernel 0.0.8.post3 (#5342) | 2025-04-13 00:45:59 -07:00 |
| Yi Zhang | aba5ca154d | python transfer custom allreduce from trt kernel to vllm kernel (#5080) | 2025-04-05 15:35:55 -07:00 |
| Yineng Zhang | 0d99adb715 | upgrade transformers 4.51.0 (#5088) | 2025-04-05 14:20:23 -07:00 |
| Yineng Zhang | e53bf190bc | upgrade sgl-kernel v0.0.7 (#5049) | 2025-04-03 17:07:54 -07:00 |
| Xiaoyu Zhang | 772d2a191d | try to fix ci oserror (#5024) | 2025-04-03 02:45:05 -07:00 |
| Yineng Zhang | 1c63e79756 | use fa3 in sgl-kernel (#4954) | 2025-03-31 16:14:49 -07:00 |
| Lianmin Zheng | b26bc86b36 | Support page size > 1 + eagle (#4908) | 2025-03-30 00:46:23 -07:00 |
| Yineng Zhang | d8a136a113 | upgrade sgl-kernel 0.0.5.post4 (#4873) | 2025-03-28 19:48:56 -07:00 |
| Lianmin Zheng | 74e0ac1dbd | Clean up import vllm in quantization/__init__.py (#4834) | 2025-03-28 10:34:10 -07:00 |
| Xiaoyu Zhang | 04e3ff6975 | Support compressed tensors fp8w8a8 (#4743) | 2025-03-26 13:21:25 -07:00 |
| Adarsh Shirawalmath | f8f9244a61 | [Bug Fix] Add partial rotary factor support for Phi-4 and upgrade to transformers v4.50.0 (#3984); Co-authored-by: Chayenne <zhaochen20@outlook.com> | 2025-03-22 14:27:39 -07:00 |
| Xiaoyu Zhang | dd865befde | [Hotfix] solve fp8 w8a8 ci test fail (#4531) | 2025-03-17 23:17:04 -07:00 |
| 萝卜菜 | d6d21640d3 | [Feature] Support Deepseek-VL2 (#2798); Co-authored-by: Edenzzzz <wtan45@wisc.edu>; Co-authored-by: Chayenne <zhaochen20@outlook.com>; Co-authored-by: Yi Zhang <1109276519@qq.com> | 2025-03-16 23:07:59 -07:00 |
| Ying Sheng | 1b859295f4 | [Eagle] Remove the greedy branch and some redundant code (#4363); Co-authored-by: Sehoon Kim <sehoon@x.ai> | 2025-03-16 02:48:55 -07:00 |
| Mick | 035ac2ab74 | ci: update transformers==4.48.3 (#4451) | 2025-03-15 13:27:26 -07:00 |
| Yineng Zhang | ad1ae7f7cd | use topk_softmax with sgl-kernel (#4439) | 2025-03-14 15:59:06 -07:00 |
| Lianmin Zheng | f141298a3c | Update ci_install_dependency.sh to use accelerate 1.4.0 (#4392); Co-authored-by: wangyu <wangyu.steph@bytedance.com>; Co-authored-by: wangyu <yuwangauto@foxmail.com> | 2025-03-13 07:16:11 -07:00 |
| Yineng Zhang | 3623b6a7f5 | upgrade sgl-kernel 0.0.5 (#4381) | 2025-03-13 02:37:56 -07:00 |
| Yineng Zhang | ed91561f79 | upgrade sgl-kernel 0.0.4.post3 (#4334) | 2025-03-12 01:36:41 -07:00 |
| Yineng Zhang | 1cf63485c1 | upgrade flashinfer 0.2.3 (#4317); Co-authored-by: qingquansong <qsong@linkedin.com> | 2025-03-11 15:37:17 -07:00 |
| Yineng Zhang | 4d27eb9ad1 | update sgl-kernel 0.0.4.post2 (#4291) | 2025-03-11 00:34:33 -07:00 |
| Lianmin Zheng | 5a6400eec5 | Test no vllm custom allreduce (#4256) | 2025-03-10 10:08:25 -07:00 |
| Yineng Zhang | 89ccb533ad | use sgl-kernel 0.0.4 (#4224) | 2025-03-08 23:43:09 -08:00 |
| Yineng Zhang | 70866b6f4f | use same version for ci and pyproject (#4187) | 2025-03-07 10:39:55 -08:00 |
| Yineng Zhang | 564bdf29f7 | upgrade flashinfer v0.2.2.post1 (#3934) | 2025-02-27 09:53:48 -08:00 |
| Lianmin Zheng | c9745ee082 | Fix pandas dependency in CI (#3818) | 2025-02-24 05:56:57 -08:00 |
| Yineng Zhang | 75d171a9c5 | chore: update flashinfer v0.2.1.post2 (#3644) | 2025-02-18 02:47:42 +08:00 |
| Shi Shuai | 7443197a63 | [CI] Improve Docs CI Efficiency (#3587); Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com> | 2025-02-14 19:57:00 -08:00 |
| Yineng Zhang | 70f894b810 | feat: support flashinfer mla attention for deepseek v3 (#3550) | 2025-02-14 08:50:14 +08:00 |
| Yineng Zhang | 4d2dbeaca7 | remove cutex dependency (#3422) | 2025-02-09 18:33:20 +08:00 |
| Yineng Zhang | d39899e85c | upgrade flashinfer v0.2.0.post2 (#3288); Co-authored-by: pankajroark <pankajroark@users.noreply.github.com> | 2025-02-04 21:41:40 +08:00 |
| Yineng Zhang | d06c1ab587 | update ci install dependency (#2949) | 2025-01-17 23:42:23 +08:00 |