142 Commits

Author SHA1 Message Date
Lianmin Zheng
03d5fbfd44 Release 0.4.1.post3 - upload the config.json to PyPI (#2647) 2024-12-29 14:25:53 -08:00
Yineng Zhang
3ccf566b0d chore: bump v0.4.1.post2 (#2643) 2024-12-30 00:11:46 +08:00
Yineng Zhang
ef5b0ff90b chore: bump v0.4.1.post1 (#2616) 2024-12-28 00:11:06 +08:00
HandH1998
6e5305158c update sgl_moe_align_block_size usage (#2617) 2024-12-28 00:01:13 +08:00
yudian0504
531d6ea968 fix: package data missing (#2521) 2024-12-26 08:16:48 -08:00
Yineng Zhang
635a042623 docs: update deepseek v3 example (#2592) 2024-12-26 17:43:37 +08:00
Yineng Zhang
efc52f85e2 chore: bump v0.4.1 (#2582) 2024-12-26 07:14:51 +08:00
Yineng Zhang
60e2fdcf4f use sgl-kernel moe_align_block_size (#2581)
Co-authored-by: ispobock <ispobaoke@163.com>
Co-authored-by: HandH1998 <1335248067@qq.com>
2024-12-26 06:29:08 +08:00
Yineng Zhang
8f4d04e540 chore: bump v0.4.0.post2 (#2525) 2024-12-21 21:16:34 +08:00
Jerry Zhang
feb2b768ba Add integration with gemlite weight only quant (#2528) 2024-12-21 00:25:25 +08:00
Yineng Zhang
4b83db24f1 fix: continue to use flashinfer 0.1.6 temporarily (#2517) 2024-12-19 14:03:24 +08:00
Yineng Zhang
626a99ac13 chore: update ao v0.7.0 (#2447) 2024-12-11 04:44:28 -08:00
Lianmin Zheng
641b7d0ae0 [Minor] Improve code style (#2422) 2024-12-09 06:30:35 -08:00
SangBin Cho
1f09e84b9a nit: Remove busy waiting on scheduler (#2382) 2024-12-08 01:06:15 -08:00
Yineng Zhang
aaac33fd8d fix: update xgrammar v0.1.6 (#2390) 2024-12-07 21:09:16 +08:00
Lianmin Zheng
e5f227c0ee Release v0.4.0.post1 (#2375) 2024-12-06 06:08:19 -08:00
Yineng Zhang
2db4469808 minor: limit the range of vllm versions (#2350) 2024-12-05 02:00:34 +08:00
Yineng Zhang
f8b0326934 chore: bump v0.4.0 (#2338) 2024-12-03 11:55:41 -08:00
Yineng Zhang
fae4e5e99a chore: bump v0.3.6.post3 (#2259) 2024-11-30 01:41:16 +08:00
Lianmin Zheng
fed4c6946a Release v0.3.6.post2 (#2214)
Co-authored-by: Yineng Zhang <me@zhyncs.com>
2024-11-27 03:35:30 -08:00
Yineng Zhang
bc1f6fda0d fix: add cuda-python for xgrammar (#2199) 2024-11-26 17:24:18 +08:00
Lianmin Zheng
ac5a0f0488 Release v0.3.6.post1 (#2189) 2024-11-25 17:31:37 -08:00
Lianmin Zheng
1605ae121e [CI] Minor fix for CI (#2187) 2024-11-25 16:38:43 -08:00
Yixin Dong
7f076c2ce6 Update XGrammar to the latest API (#2176)
Co-authored-by: Ben Gitter <gitterbd@gmail.com>
2024-11-25 15:58:30 -08:00
Ankur Neog
865233e256 Add initial support for intel Gaudi accelerators (#2121) 2024-11-22 20:22:23 -08:00
Yineng Zhang
2797bc3422 fix: add xgrammar dependency (#2126) 2024-11-22 20:53:11 +08:00
Yineng Zhang
9a00e6f453 chore: bump v0.3.6 (#2120) 2024-11-22 19:27:30 +08:00
Lianmin Zheng
dfec7fca06 Rename sglang.bench_latency to sglang.bench_one_batch (#2118) 2024-11-21 20:07:48 -08:00
Yineng Zhang
766192610e feat: update torch 2.5.1 (#2069) 2024-11-18 21:29:13 +08:00
Lianmin Zheng
c1f401fc58 Revert "chore: update torch v2.5.1" (#2063) 2024-11-17 15:29:38 -08:00
Yineng Zhang
3b878863f7 chore: update torch v2.5.1 (#1849) 2024-11-18 00:06:00 +08:00
Lianmin Zheng
32c9a7ec11 Release v0.3.5.post2 (#2046) 2024-11-15 06:54:00 -08:00
Lianmin Zheng
a10d530943 Fix outlines version (#2036) 2024-11-14 12:52:40 -08:00
Lianmin Zheng
f407fcf9ef Release v0.3.5.post1 (#2022) 2024-11-13 10:27:12 -08:00
Yineng Zhang
b3523af8eb fix: update pyzmq version (#1983) 2024-11-10 21:33:23 +08:00
Huanzhi (Hans) Mao
ed53ac84b4 Specify zmq Version Requirement (#1982) 2024-11-10 01:32:07 -08:00
Yudi Xue
95a4ed129a Fix metrics (#1963) 2024-11-08 23:21:11 -08:00
Lianmin Zheng
a509552087 [minor] Improve code style and compatibility (#1961) 2024-11-08 02:19:41 -08:00
Lianmin Zheng
65859754f1 Release v0.3.5 (#1908) 2024-11-03 13:48:11 -08:00
HAI
d8e9d61f86 [Build, ROCm] Dockerfile.rocm for Instinct GPUs, with package updates (#1861) 2024-10-31 16:38:16 -07:00
Chayenne
539df95d2c Imporve openai api documents (#1827)
Co-authored-by: Chayenne <zhaochenyang@g.ucla.edu>
2024-10-30 00:39:41 -07:00
Lianmin Zheng
30643fed7f Release v0.3.4.post2 (#1796)
Co-authored-by: DarkSharpness <76582120+DarkSharpness@users.noreply.github.com>
2024-10-25 11:07:19 -07:00
Lianmin Zheng
1f26e8b8e4 Release v0.3.4.post1 (#1749) 2024-10-21 21:16:43 -07:00
Liangsheng Yin
94cde10920 Llama3.2 vision model support (#1551) 2024-10-21 15:01:21 -07:00
Yineng Zhang
8bee20f80b Update vllm to 0.6.3 (#1711) (#1720)
Co-authored-by: Ke Bao <ISPObaoke@163.com>
2024-10-19 20:45:41 -07:00
Lianmin Zheng
087257ea03 Release v0.3.4 (#1714) 2024-10-19 08:17:41 -07:00
Michael Feil
b0facb3316 add orjson for jsonresponse (#1688) 2024-10-16 18:14:30 -07:00
Ke Bao
d10b933a36 Fix srt dependency (#1685) 2024-10-16 08:21:20 -07:00
Zhang, Liangang
5d638c92f5 [Feature, Hardware] Enable SGLang on XPU GPUs via PyTorch (#1480) 2024-10-12 18:10:32 +00:00
Lianmin Zheng
00c7e6368b Release v0.3.3.post1 (#1636) 2024-10-11 07:56:16 -07:00