51 Commits

Author SHA1 Message Date
Yineng Zhang
7e257cd666 chore: bump v0.4.6.post5 (#6566) 2025-05-24 00:48:05 -07:00
Yineng Zhang
16267d4fa7 chore: bump v0.4.6.post4 (#6245) 2025-05-13 01:57:51 -07:00
Lianmin Zheng
e8e18dcdcc Revert "fix some typos" (#6244) 2025-05-12 12:53:26 -07:00
applesaucethebun
d738ab52f8 fix some typos (#6209) 2025-05-13 01:42:38 +08:00
Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>
Yineng Zhang
678d8cc987 chore: bump v0.4.6.post3 (#6165) 2025-05-09 15:38:47 -07:00
Yineng Zhang
9858113c33 chore: bump v0.4.6.post2 (#5939) 2025-04-30 22:04:40 -07:00
Yineng Zhang
dcae1fb2cd chore: bump v0.4.6.post1 (#5845) 2025-04-28 12:57:08 -07:00
Baizhou Zhang
84022c0e56 Release v0.4.6 (#5795) 2025-04-27 14:07:05 -07:00
Yineng Zhang
b9c87e781d chore: bump v0.4.5.post3 (#5611) 2025-04-21 18:16:20 -07:00
lukec
417b44eba8 [Feat] upgrade pytorch2.6 (#5417) 2025-04-20 16:06:34 -07:00
AniZpZ
d95269f9b3 [2/3] fix dsv3 awq issue (#4625) 2025-04-03 17:36:39 -07:00
Co-authored-by: 晟海 <huangtingwei.htw@antgroup.com>
Co-authored-by: laixinn <xielx@shanghaitech.edu.cn>
Wenbo Yang
75b656488a Support serving DeepSeek-R1-Channel-INT8 with 32 L40S. (#4418) 2025-03-17 00:03:43 -07:00
Zhan Lu
660305c38a [Doc] fix wrong flag in deepseek documentation (#4427) 2025-03-14 11:30:55 -07:00
laixin
0c02086015 add INT8 example into dsv3 README (#4079) 2025-03-12 21:37:30 -07:00
lukec
ffa1b3e318 Add an example of using deepseekv3 int8 sglang. (#4177) 2025-03-07 01:56:09 -08:00
Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>
Yineng Zhang
5d86016855 revert "Docs: Reorngaize dpsk links #3900" (#3933) 2025-02-27 08:57:13 -08:00
Chayenne
7c1692aa90 Docs: Reorngaize dpsk links (#3900) 2025-02-26 15:16:31 -08:00
Zhanghao Wu
f93e915817 [Docs] Add SkyPilot DeepSeek example (#3706) 2025-02-20 02:10:23 +08:00
Yineng Zhang
fe0673f1cc set NCCL_IB_GID_INDEX=3 for multi node NVIDIA InfiniBand if needed (#3698) 2025-02-19 20:50:22 +08:00
Shenggui Li
c9565e49e7 [docker] added rdma support (#3619) 2025-02-17 15:36:16 +08:00
Yineng Zhang
ac963be234 update flashinfer-python (#3557) 2025-02-14 09:52:56 +08:00
Yineng Zhang
e0b9a423c8 chore: bump v0.4.3 (#3556) 2025-02-14 09:43:14 +08:00
Yineng Zhang
20de05a753 update README (#3543) 2025-02-13 17:22:11 +08:00
Jhin
bf2a70872e Update DeepSeek V3 Doc (#3541) 2025-02-12 23:15:37 -08:00
Xiaoyu Zhang
693c2600e0 refine deepseek_v3 launch server doc (#3522) 2025-02-12 17:27:07 +08:00
Yineng Zhang
cddb1cdf8f chore: bump v0.4.2.post4 (#3459) 2025-02-10 14:12:16 +08:00
Yineng Zhang
f90db8bc07 fix typo 2025-02-08 22:16:42 +08:00
Ke Bao
d8ad597048 Add deepseek-v3 a100 serving example (#3404) 2025-02-08 22:13:52 +08:00
Yineng Zhang
c1f5f99f60 chore: bump v0.4.2.post3 (#3369) 2025-02-07 08:20:03 -08:00
Ke Bao
6792411e7f [Doc] Add optimization option guide for deepseek v3 (#3349) 2025-02-06 23:28:09 +08:00
Yineng Zhang
7348d9627e add AMD guide for DeepSeek-R1 (#3338) 2025-02-06 16:54:40 +08:00
Yineng Zhang
07e58a2dcb update README (#3324) 2025-02-06 07:13:05 +08:00
Yineng Zhang
80002562a8 docs: update README (#2878) 2025-01-14 12:48:17 +08:00
Yineng Zhang
41d7e5b7e6 docs: update link (#2857) 2025-01-13 18:40:48 +08:00
Lianmin Zheng
72c7776355 Fix linear.py and improve weight loading (#2851) 2025-01-13 01:39:14 -08:00
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
Yineng Zhang
197cbf9bab docs: update README (#2841) 2025-01-11 23:11:38 +08:00
Yineng Zhang
f624901cdd chore: bump v0.4.1.post5 (#2840) 2025-01-11 23:10:02 +08:00
Rodrigo Garcia
a990daff9c Included multi-node DeepSeekv3 example (#2707) 2025-01-02 22:17:03 +08:00
Lianmin Zheng
ad20b7957e Eagle speculative decoding part 3: small modifications to the general scheduler (#2709) 2025-01-02 02:09:08 -08:00
Co-authored-by: kavioyu <kavioyu@tencent.com>
Lianmin Zheng
8c3b420eec [Docs] clean up structured outputs docs (#2654) 2024-12-29 23:57:16 -08:00
Yineng Zhang
098d659c0e docs: update README (#2651) 2024-12-30 13:33:29 +08:00
Lzhang-hub
76d14f8cb9 add 2*h20 node serving example for deepseek v3 (#2650) 2024-12-30 13:04:38 +08:00
Co-authored-by: Yineng Zhang <me@zhyncs.com>
Lianmin Zheng
03d5fbfd44 Release 0.4.1.post3 - upload the config.json to PyPI (#2647) 2024-12-29 14:25:53 -08:00
Yineng Zhang
763dd55d17 docs: update README (#2644) 2024-12-30 01:24:06 +08:00
Ke Bao
8a2681e26a Update readme (#2625) 2024-12-28 13:39:56 +08:00
Yineng Zhang
d9e6ee382b docs: update README (#2618) 2024-12-28 00:21:53 +08:00
Lianmin Zheng
f46f394f4d Update README.md (#2605) 2024-12-26 10:58:49 -08:00
Lianmin Zheng
773951548d Fix logprob_start_len for multi modal models (#2597) 2024-12-26 06:27:45 -08:00
Co-authored-by: libra <lihu723@gmail.com>
Co-authored-by: fzyzcjy <ch271828n@outlook.com>
Co-authored-by: Wang, Haoyu <haoyu.wang@intel.com>
fsygd
637de9e8ce update readme of DeepSeek V3 (#2596) 2024-12-26 21:31:56 +08:00
Yineng Zhang
635a042623 docs: update deepseek v3 example (#2592) 2024-12-26 17:43:37 +08:00