| Author | Commit | Message | Date |
|---|---|---|---|
| Yineng Zhang | 7e257cd666 | chore: bump v0.4.6.post5 (#6566) | 2025-05-24 00:48:05 -07:00 |
| Yineng Zhang | 16267d4fa7 | chore: bump v0.4.6.post4 (#6245) | 2025-05-13 01:57:51 -07:00 |
| Lianmin Zheng | e8e18dcdcc | Revert "fix some typos" (#6244) | 2025-05-12 12:53:26 -07:00 |
| applesaucethebun | d738ab52f8 | fix some typos (#6209); Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca> | 2025-05-13 01:42:38 +08:00 |
| Yineng Zhang | 678d8cc987 | chore: bump v0.4.6.post3 (#6165) | 2025-05-09 15:38:47 -07:00 |
| Yineng Zhang | 9858113c33 | chore: bump v0.4.6.post2 (#5939) | 2025-04-30 22:04:40 -07:00 |
| Yineng Zhang | dcae1fb2cd | chore: bump v0.4.6.post1 (#5845) | 2025-04-28 12:57:08 -07:00 |
| Baizhou Zhang | 84022c0e56 | Release v0.4.6 (#5795) | 2025-04-27 14:07:05 -07:00 |
| Yineng Zhang | b9c87e781d | chore: bump v0.4.5.post3 (#5611) | 2025-04-21 18:16:20 -07:00 |
| lukec | 417b44eba8 | [Feat] upgrade pytorch2.6 (#5417) | 2025-04-20 16:06:34 -07:00 |
| AniZpZ | d95269f9b3 | [2/3] fix dsv3 awq issue (#4625); Co-authored-by: 晟海 <huangtingwei.htw@antgroup.com>; Co-authored-by: laixinn <xielx@shanghaitech.edu.cn> | 2025-04-03 17:36:39 -07:00 |
| Wenbo Yang | 75b656488a | Support serving DeepSeek-R1-Channel-INT8 with 32 L40S. (#4418) | 2025-03-17 00:03:43 -07:00 |
| Zhan Lu | 660305c38a | [Doc] fix wrong flag in deepseek documentation (#4427) | 2025-03-14 11:30:55 -07:00 |
| laixin | 0c02086015 | add INT8 example into dsv3 README (#4079) | 2025-03-12 21:37:30 -07:00 |
| lukec | ffa1b3e318 | Add an example of using deepseekv3 int8 sglang. (#4177); Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com> | 2025-03-07 01:56:09 -08:00 |
| Yineng Zhang | 5d86016855 | revert "Docs: Reorngaize dpsk links #3900" (#3933) | 2025-02-27 08:57:13 -08:00 |
| Chayenne | 7c1692aa90 | Docs: Reorngaize dpsk links (#3900) | 2025-02-26 15:16:31 -08:00 |
| Zhanghao Wu | f93e915817 | [Docs] Add SkyPilot DeepSeek example (#3706) | 2025-02-20 02:10:23 +08:00 |
| Yineng Zhang | fe0673f1cc | set NCCL_IB_GID_INDEX=3 for multi node NVIDIA InfiniBand if needed (#3698) | 2025-02-19 20:50:22 +08:00 |
| Shenggui Li | c9565e49e7 | [docker] added rdma support (#3619) | 2025-02-17 15:36:16 +08:00 |
| Yineng Zhang | ac963be234 | update flashinfer-python (#3557) | 2025-02-14 09:52:56 +08:00 |
| Yineng Zhang | e0b9a423c8 | chore: bump v0.4.3 (#3556) | 2025-02-14 09:43:14 +08:00 |
| Yineng Zhang | 20de05a753 | update README (#3543) | 2025-02-13 17:22:11 +08:00 |
| Jhin | bf2a70872e | Update DeepSeek V3 Doc (#3541) | 2025-02-12 23:15:37 -08:00 |
| Xiaoyu Zhang | 693c2600e0 | refine deepseek_v3 launch server doc (#3522) | 2025-02-12 17:27:07 +08:00 |
| Yineng Zhang | cddb1cdf8f | chore: bump v0.4.2.post4 (#3459) | 2025-02-10 14:12:16 +08:00 |
| Yineng Zhang | f90db8bc07 | fix typo | 2025-02-08 22:16:42 +08:00 |
| Ke Bao | d8ad597048 | Add deepseek-v3 a100 serving example (#3404) | 2025-02-08 22:13:52 +08:00 |
| Yineng Zhang | c1f5f99f60 | chore: bump v0.4.2.post3 (#3369) | 2025-02-07 08:20:03 -08:00 |
| Ke Bao | 6792411e7f | [Doc] Add optimization option guide for deepseek v3 (#3349) | 2025-02-06 23:28:09 +08:00 |
| Yineng Zhang | 7348d9627e | add AMD guide for DeepSeek-R1 (#3338) | 2025-02-06 16:54:40 +08:00 |
| Yineng Zhang | 07e58a2dcb | update README (#3324) | 2025-02-06 07:13:05 +08:00 |
| Yineng Zhang | 80002562a8 | docs: update README (#2878) | 2025-01-14 12:48:17 +08:00 |
| Yineng Zhang | 41d7e5b7e6 | docs: update link (#2857) | 2025-01-13 18:40:48 +08:00 |
| Lianmin Zheng | 72c7776355 | Fix linear.py and improve weight loading (#2851); Co-authored-by: SangBin Cho <rkooo567@gmail.com> | 2025-01-13 01:39:14 -08:00 |
| Yineng Zhang | 197cbf9bab | docs: update README (#2841) | 2025-01-11 23:11:38 +08:00 |
| Yineng Zhang | f624901cdd | chore: bump v0.4.1.post5 (#2840) | 2025-01-11 23:10:02 +08:00 |
| Rodrigo Garcia | a990daff9c | Included multi-node DeepSeekv3 example (#2707) | 2025-01-02 22:17:03 +08:00 |
| Lianmin Zheng | ad20b7957e | Eagle speculative decoding part 3: small modifications to the general scheduler (#2709); Co-authored-by: kavioyu <kavioyu@tencent.com> | 2025-01-02 02:09:08 -08:00 |
| Lianmin Zheng | 8c3b420eec | [Docs] clean up structured outputs docs (#2654) | 2024-12-29 23:57:16 -08:00 |
| Yineng Zhang | 098d659c0e | docs: update README (#2651) | 2024-12-30 13:33:29 +08:00 |
| Lzhang-hub | 76d14f8cb9 | add 2*h20 node serving example for deepseek v3 (#2650); Co-authored-by: Yineng Zhang <me@zhyncs.com> | 2024-12-30 13:04:38 +08:00 |
| Lianmin Zheng | 03d5fbfd44 | Release 0.4.1.post3 - upload the config.json to PyPI (#2647) | 2024-12-29 14:25:53 -08:00 |
| Yineng Zhang | 763dd55d17 | docs: update README (#2644) | 2024-12-30 01:24:06 +08:00 |
| Ke Bao | 8a2681e26a | Update readme (#2625) | 2024-12-28 13:39:56 +08:00 |
| Yineng Zhang | d9e6ee382b | docs: update README (#2618) | 2024-12-28 00:21:53 +08:00 |
| Lianmin Zheng | f46f394f4d | Update README.md (#2605) | 2024-12-26 10:58:49 -08:00 |
| Lianmin Zheng | 773951548d | Fix logprob_start_len for multi modal models (#2597); Co-authored-by: libra <lihu723@gmail.com>; Co-authored-by: fzyzcjy <ch271828n@outlook.com>; Co-authored-by: Wang, Haoyu <haoyu.wang@intel.com> | 2024-12-26 06:27:45 -08:00 |
| fsygd | 637de9e8ce | update readme of DeepSeek V3 (#2596) | 2024-12-26 21:31:56 +08:00 |
| Yineng Zhang | 635a042623 | docs: update deepseek v3 example (#2592) | 2024-12-26 17:43:37 +08:00 |