Shenggui Li
|
c6a4852136
|
[docs] added torch.compile cache to dpsk manual (#3737)
|
2025-02-21 00:11:40 -08:00 |
|
Baizhou Zhang
|
ac05310098
|
[Docs] Modify ep related server args and remove cublas part of deepseek (#3732)
|
2025-02-21 03:37:56 +08:00 |
|
Chayenne
|
3c7bfd7eab
|
Docs: Fix layout with sub-section (#3710)
|
2025-02-19 15:44:30 -08:00 |
|
Baizhou Zhang
|
67fc595bb8
|
[Feature] Apply Cublas Grouped Gemm kernel (#3629)
|
2025-02-18 15:18:31 +08:00 |
|
ybyang
|
c51dc2cc8d
|
Docs: Deploy multi-node inference (LWS method) using sglang in a K8s cluster (#3624)
|
2025-02-17 18:14:20 -08:00 |
|
Shenggui Li
|
c9565e49e7
|
[docker] added rdma support (#3619)
|
2025-02-17 15:36:16 +08:00 |
|
Shi Shuai
|
d03c4c25a7
|
[docs] Update sampling_params.md (#3617)
|
2025-02-16 18:52:30 -08:00 |
|
simveit
|
8f13377dea
|
Draft of updated doc for sampling params. (#3260)
Co-authored-by: shuaills <shishuaicareer@gmail.com>
|
2025-02-16 14:28:22 -08:00 |
|
Mick
|
bcc213df61
|
Model: Support Qwen 2.5 vl (#3258)
|
2025-02-16 00:58:53 -08:00 |
|
Jhin
|
bf2a70872e
|
Update DeepSeek V3 Doc (#3541)
|
2025-02-12 23:15:37 -08:00 |
|
Zachary Streeter
|
8adbc78b30
|
added llama and cleaned up (#3503)
|
2025-02-12 18:48:30 +08:00 |
|
Mick
|
ced680663c
|
doc: Support a new vLM (#3405)
Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>
|
2025-02-12 00:43:14 -08:00 |
|
Zachary Streeter
|
2491cc928d
|
add deepseek-v3 amd docker command (#3495)
|
2025-02-12 03:03:08 +08:00 |
|
Didier Durand
|
9490d15772
|
fix supported_models Qwen typo (#3498)
|
2025-02-12 02:59:18 +08:00 |
|
Didier Durand
|
eefcbdd353
|
fix deepseek_v3 typo (#3497)
|
2025-02-12 02:58:36 +08:00 |
|
Ying Sheng
|
52a492a16e
|
Update contribution_guide.md (#3452)
|
2025-02-10 12:53:47 +08:00 |
|
Shi Shuai
|
20cf910d8f
|
[docs] Update quantization documentation (#3437)
Co-authored-by: zhaochenyang20 <zhaochenyang20@gmail.com>
Co-authored-by: jamessand <shazhizhou0@gmail.com>
|
2025-02-09 10:39:49 -08:00 |
|
Wenxuan Tan
|
0af1d239cb
|
[Docs] Add quantization docs (#3410)
Co-authored-by: yinfan98 <1106310035@qq.com>
|
2025-02-10 02:16:21 +08:00 |
|
Shi Shuai
|
6702592d0e
|
[docs] Add multi-node inference example for SLURM in documentation (#3408)
Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>
Co-authored-by: aflah02 <aflah20082@iiitd.ac.in>
|
2025-02-08 21:45:14 -08:00 |
|
Zachary Streeter
|
0a6f18f068
|
added amd_configure.md to references (#3275)
Co-authored-by: HAI <hixiao@gmail.com>
Co-authored-by: Yineng Zhang <me@zhyncs.com>
Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>
|
2025-02-07 08:50:49 -08:00 |
|
Shi Shuai
|
591e751e07
|
Fix: Runtime error for function calling (#3300)
|
2025-02-06 20:52:01 -08:00 |
|
Chayenne
|
76ca91dff2
|
Docs/CI: Enable Fake Finish for Docs Only PR (#3350)
|
2025-02-06 19:33:31 -08:00 |
|
Liangjun Song
|
455bfe8dd3
|
Add a Doc about guide on nvidia jetson #3182 (#3205)
Co-authored-by: Shi Shuai <126407087+shuaills@users.noreply.github.com>
Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>
|
2025-02-02 20:29:10 -08:00 |
|
Chayenne
|
55f5fc68ac
|
Docs: Update accuracy evaluation (#3261)
|
2025-02-02 11:14:59 -08:00 |
|
simveit
|
c27c378a19
|
docs/accuracy evaluation (#3114)
Co-authored-by: Shi Shuai <126407087+shuaills@users.noreply.github.com>
Co-authored-by: zhaochenyang20 <zhaochen20@outlook.com>
|
2025-02-02 11:01:39 -08:00 |
|
Wenxuan Tan
|
d7c0b32f4d
|
[Docs] Add more details to profiling docs (#3221)
|
2025-01-31 15:59:28 -08:00 |
|
Ravi Theja
|
9829e77e3f
|
Docs: Update supported models with Mistral 3 (#3229)
Co-authored-by: Ravi Theja Desetty <ravitheja@Ravis-MacBook-Pro.local>
|
2025-01-31 00:01:46 -08:00 |
|
Mick
|
9f635ea50d
|
[Fix] Address remaining issues of supporting MiniCPMV (#2977)
|
2025-01-28 00:22:13 -08:00 |
|
Adarsh Shirawalmath
|
4505a43614
|
[Docs] minor update for phi-3 and phi-4 (#3096)
|
2025-01-24 04:00:20 -08:00 |
|
Baizhou Zhang
|
b3393e941f
|
[Doc] Update doc of profiling with PyTorch Profiler (#3038)
|
2025-01-22 14:17:26 -08:00 |
|
Hongpeng Guo
|
949b3fbfce
|
[Doc] Update doc of custom logit processor (#3021)
Signed-off-by: Hongpeng Guo <hpguo@anyscale.com>
|
2025-01-20 16:50:25 -08:00 |
|
Chayenne
|
2584f6d944
|
Docs: Add Performance Demonstaration for DPA (#3005)
|
2025-01-20 01:00:52 -08:00 |
|
Lianmin Zheng
|
03464890e0
|
Separate two entry points: Engine and HTTP server (#2996)
Co-authored-by: fzyzcjy <5236035+fzyzcjy@users.noreply.github.com>
|
2025-01-19 22:09:24 -08:00 |
|
Enrique Shockwave
|
3bcf5ecea7
|
support regex in xgrammar backend (#2983)
|
2025-01-20 04:34:41 +08:00 |
|
Yineng Zhang
|
def5c31873
|
docs: update supported_models (#2987)
|
2025-01-20 00:44:30 +08:00 |
|
Mick
|
3d93f84a00
|
[Feature] Support minicpmv v2.6 (#2785)
Co-authored-by: Chayenne <zhaochen20@outlook.com>
Co-authored-by: yizhang2077 <1109276519@qq.com>
|
2025-01-18 14:14:19 -08:00 |
|
Wen Sun
|
120c3634ef
|
Fix Llama-3.1-405B References Docs (#2944)
|
2025-01-17 14:46:38 -08:00 |
|
Lianmin Zheng
|
8b6ce52e92
|
Support multi-node DP attention (#2925)
Co-authored-by: dhou-xai <dhou@x.ai>
|
2025-01-16 11:15:00 -08:00 |
|
Yineng Zhang
|
41d7e5b7e6
|
docs: update link (#2857)
|
2025-01-13 18:40:48 +08:00 |
|
Lianmin Zheng
|
72c7776355
|
Fix linear.py and improve weight loading (#2851)
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
|
2025-01-13 01:39:14 -08:00 |
|
Shi Shuai
|
c4f9707e16
|
Improve: Token-In Token-Out Usage for RLHF (#2843)
|
2025-01-11 15:14:26 -08:00 |
|
Chayenne
|
5cc1170552
|
Doc: add block-wise FP8 in dpsk model reference (#2830)
|
2025-01-10 00:26:59 -08:00 |
|
Xiaotong Jiang
|
11fffbc95a
|
[Doc]: Deepseek reference docs (#2787)
|
2025-01-09 13:43:12 -08:00 |
|
Chayenne
|
2e6346fc2e
|
Docs:Update the style of llma 3.1 405B docs (#2789)
|
2025-01-08 01:07:54 -08:00 |
|
mlmz
|
977f785dad
|
Docs: Rewrite docs for LLama 405B and ModelSpace (#2773)
Co-authored-by: Chayenne <zhaochen20@outlook.com>
|
2025-01-08 00:02:59 -08:00 |
|
Lianmin Zheng
|
0f9cc6d8d3
|
Fix package loss for small models (#2717)
Co-authored-by: sdli1995 <mmlmonkey@163.com>
|
2025-01-02 18:25:26 -08:00 |
|
Shi Shuai
|
dd2e2d275f
|
Docs: Update documentation workflow and contribution guide (#2704)
Co-authored-by: Chayenne <zhaochen20@outlook.com>
|
2025-01-02 09:18:31 -08:00 |
|
Shi Shuai
|
062c48d2bd
|
[Docs] Add Support for Pydantic Structured Output Format (#2697)
|
2025-01-01 15:08:43 -08:00 |
|
Shi Shuai
|
0a765bbccc
|
Docs: Refactor Contribution Guide (#2690)
|
2024-12-31 14:11:00 -08:00 |
|
Lianmin Zheng
|
8c3b420eec
|
[Docs] clean up structured outputs docs (#2654)
|
2024-12-29 23:57:16 -08:00 |
|