| Shenggui Li | c9565e49e7 | [docker] added rdma support (#3619) | 2025-02-17 15:36:16 +08:00 |
| Shi Shuai | d03c4c25a7 | [docs] Update sampling_params.md (#3617) | 2025-02-16 18:52:30 -08:00 |
| simveit | 8f13377dea | Draft of updated doc for sampling params. (#3260); Co-authored-by: shuaills <shishuaicareer@gmail.com> | 2025-02-16 14:28:22 -08:00 |
| Mick | bcc213df61 | Model: Support Qwen 2.5 vl (#3258) | 2025-02-16 00:58:53 -08:00 |
| Shenggui Li | 231c40d859 | [docs] added favicon to sphinx html (#3564) | 2025-02-15 10:21:21 -08:00 |
| Mick | 7711ac6ed0 | doc: emphasize and notify the usage of chat_template (#3589); Co-authored-by: Chayenne <zhaochen20@outlook.com> | 2025-02-15 00:10:32 -08:00 |
| Shi Shuai | 7443197a63 | [CI] Improve Docs CI Efficiency (#3587); Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com> | 2025-02-14 19:57:00 -08:00 |
| Yineng Zhang | 4e23c961e8 | docs: update install (#3581) | 2025-02-14 18:54:50 +08:00 |
| Yineng Zhang | 31eec35ba8 | fix doc (#3558) | 2025-02-14 10:11:31 +08:00 |
| Yineng Zhang | ac963be234 | update flashinfer-python (#3557) | 2025-02-14 09:52:56 +08:00 |
| Yineng Zhang | e0b9a423c8 | chore: bump v0.4.3 (#3556) | 2025-02-14 09:43:14 +08:00 |
| simveit | 368de3661e | Update install docs (#3553); Co-authored-by: Chayenne <zhaochen20@outlook.com> | 2025-02-13 13:42:51 -08:00 |
| Jhin | bf2a70872e | Update DeepSeek V3 Doc (#3541) | 2025-02-12 23:15:37 -08:00 |
| Zachary Streeter | 8adbc78b30 | added llama and cleaned up (#3503) | 2025-02-12 18:48:30 +08:00 |
| Mick | ced680663c | doc: Support a new vLM (#3405); Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com> | 2025-02-12 00:43:14 -08:00 |
| Yineng Zhang | 2f48221033 | docs: update install | 2025-02-12 03:13:31 +08:00 |
| Zachary Streeter | 2491cc928d | add deepseek-v3 amd docker command (#3495) | 2025-02-12 03:03:08 +08:00 |
| Didier Durand | 67c5de9286 | fix router typo (#3496) | 2025-02-12 03:00:57 +08:00 |
| Didier Durand | 1e2cf2b541 | fix server_arguments typo (#3499) | 2025-02-12 02:59:53 +08:00 |
| Didier Durand | 9490d15772 | fix supported_models Qwen typo (#3498) | 2025-02-12 02:59:18 +08:00 |
| Didier Durand | eefcbdd353 | fix deepseek_v3 typo (#3497) | 2025-02-12 02:58:36 +08:00 |
| Jackmin801 | 5f0e7de339 | [Feat] Return hidden states (experimental) (#3364); Co-authored-by: Chayenne <zhaochen20@outlook.com> | 2025-02-10 15:54:37 -08:00 |
| Yineng Zhang | cddb1cdf8f | chore: bump v0.4.2.post4 (#3459) | 2025-02-10 14:12:16 +08:00 |
| Yineng Zhang | 27c4c9cf52 | remove _grouped_size_compiled_for_decode_kernels (#3453) | 2025-02-10 13:01:21 +08:00 |
| Ying Sheng | 52a492a16e | Update contribution_guide.md (#3452) | 2025-02-10 12:53:47 +08:00 |
| Shi Shuai | 20cf910d8f | [docs] Update quantization documentation (#3437); Co-authored-by: zhaochenyang20 <zhaochenyang20@gmail.com>; Co-authored-by: jamessand <shazhizhou0@gmail.com> | 2025-02-09 10:39:49 -08:00 |
| Wenxuan Tan | 0af1d239cb | [Docs] Add quantization docs (#3410); Co-authored-by: yinfan98 <1106310035@qq.com> | 2025-02-10 02:16:21 +08:00 |
| Yineng Zhang | 4d2dbeaca7 | remove cutex dependency (#3422) | 2025-02-09 18:33:20 +08:00 |
| Shi Shuai | 6702592d0e | [docs] Add multi-node inference example for SLURM in documentation (#3408); Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>; Co-authored-by: aflah02 <aflah20082@iiitd.ac.in> | 2025-02-08 21:45:14 -08:00 |
| Zachary Streeter | 0a6f18f068 | added amd_configure.md to references (#3275); Co-authored-by: HAI <hixiao@gmail.com>; Co-authored-by: Yineng Zhang <me@zhyncs.com>; Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com> | 2025-02-07 08:50:49 -08:00 |
| Yineng Zhang | c1f5f99f60 | chore: bump v0.4.2.post3 (#3369) | 2025-02-07 08:20:03 -08:00 |
| Shi Shuai | 591e751e07 | Fix: Runtime error for function calling (#3300) | 2025-02-06 20:52:01 -08:00 |
| Chayenne | 76ca91dff2 | Docs/CI: Enable Fake Finish for Docs Only PR (#3350) | 2025-02-06 19:33:31 -08:00 |
| Yineng Zhang | 7aad8d1854 | chore: bump v0.4.2.post2 (#3313) | 2025-02-05 17:35:02 +08:00 |
| Yineng Zhang | 6186a8f889 | update flashinfer install index url (#3293) | 2025-02-05 00:44:35 +08:00 |
| HAI | 2c1a695ff1 | ROCm: sgl-kernel enablement starting with sgl_moe_align_block (#3287) | 2025-02-04 21:44:44 +08:00 |
| Baizhou Zhang | 70817a7eae | [Feature] Define backends and add Triton backend for Lora (#3161); Co-authored-by: Ying Sheng <sqy1415@gmail.com> | 2025-02-03 22:09:13 -08:00 |
| simveit | 7b5a374114 | Update server args doc (#3273); Co-authored-by: Shi Shuai <126407087+shuaills@users.noreply.github.com> | 2025-02-03 23:39:41 +00:00 |
| Liangjun Song | 455bfe8dd3 | Add a Doc about guide on nvidia jetson #3182 (#3205); Co-authored-by: Shi Shuai <126407087+shuaills@users.noreply.github.com>; Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com> | 2025-02-02 20:29:10 -08:00 |
| HAI | 566d61d90f | ROCm: bump 6.3.0 (#3259) | 2025-02-03 04:13:40 +08:00 |
| Chayenne | 55f5fc68ac | Docs: Update accuracy evaluation (#3261) | 2025-02-02 11:14:59 -08:00 |
| simveit | c27c378a19 | docs/accuracy evaluation (#3114); Co-authored-by: Shi Shuai <126407087+shuaills@users.noreply.github.com>; Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com> | 2025-02-02 11:01:39 -08:00 |
| Wenxuan Tan | d7c0b32f4d | [Docs] Add more details to profiling docs (#3221) | 2025-01-31 15:59:28 -08:00 |
| Jhin | 656f7fc1bc | Docs: Quick fix for Speculative_decoding doc (#3228); Co-authored-by: Chayenne <zhaochenyang@ucla.edu>; Co-authored-by: Chayenne <zhaochen20@outlook.com> | 2025-01-31 08:30:40 -08:00 |
| Yineng Zhang | cf0f7eafe6 | chore: bump v0.4.2.post1 (#3233) | 2025-01-31 20:35:55 +08:00 |
| Ravi Theja | 9829e77e3f | Docs: Update supported models with Mistral 3 (#3229); Co-authored-by: Ravi Theja Desetty <ravitheja@Ravis-MacBook-Pro.local> | 2025-01-31 00:01:46 -08:00 |
| Mick | 9f635ea50d | [Fix] Address remaining issues of supporting MiniCPMV (#2977) | 2025-01-28 00:22:13 -08:00 |
| Jhin | 7b9b4f4426 | Docs fix about EAGLE and streaming output (#3166); Co-authored-by: Chayenne <zhaochenyang@ucla.edu>; Co-authored-by: Chayenne <zhaochen20@outlook.com>; Co-authored-by: Jhin <jhinpan@umich.edu> | 2025-01-27 18:10:45 -08:00 |
| Yineng Zhang | 4ab43cfb3e | chore: bump v0.4.2 (#3180) | 2025-01-27 21:42:05 +08:00 |
| Jhin | 9472e69963 | Doc: Add Docs about EAGLE speculative decoding (#3144); Co-authored-by: Chayenne <zhaochenyang@ucla.edu>; Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com> | 2025-01-26 17:49:13 -08:00 |