Simo Lin
|
7c94eaeeb0
|
[router] allow tokenizer path to be dir (#11530)
|
2025-10-13 09:30:09 -04:00 |
|
Simo Lin
|
13d596c93e
|
[router][ci] Add Nightly Release Workflow for SGLang Router (#11527)
|
2025-10-13 09:28:55 -04:00 |
|
Mohammad Miadh Angkad
|
c7867b6702
|
[Fix] Add per_channel_quant parameter to MoE config functions (#11201)
|
2025-10-13 21:26:06 +08:00 |
|
Liangsheng Yin
|
516738b096
|
Deprecate global_server_args_dict (#11528)
|
2025-10-13 19:34:43 +08:00 |
|
Yuan Luo
|
0b6f535f66
|
[Reland] perf: optimize qwen-vl with symm mem allreduce (#11457)
Co-authored-by: luoyuan.luo <luoyuan.luo@antgroup.com>
|
2025-10-13 17:51:25 +08:00 |
|
Shangming Cai
|
c5fe3c0b75
|
Tiny fix test run estimated time (#11544)
Signed-off-by: Shangming Cai <csmthu@gmail.com>
|
2025-10-13 02:23:13 -07:00 |
|
hzh0425
|
318424e2c8
|
[HICache]: Support 3FS-Store with page_first_direct layout (#11460)
|
2025-10-13 15:47:22 +08:00 |
|
Xiaoyu Zhang
|
6806c4e63e
|
[CI monitor] Improve CI analyzer: fix job failure tracking and add CUDA-focused filtering (#11505)
|
2025-10-13 13:31:09 +08:00 |
|
Mick
|
0c0779d667
|
ci: improve nightly-ci (#11385)
|
2025-10-12 21:19:34 -07:00 |
|
Yi Zhang
|
a55cf5304a
|
[Feature] Support mamba radix cache v0 (#11214)
Co-authored-by: hanming-lu <hanming@x.ai>
Co-authored-by: hzh0425 <hzh0425@apache.org>
Co-authored-by: thalahors <ericalcaide1@gmail.com>
|
2025-10-12 20:57:15 -07:00 |
|
Yuanhang Sun
|
19ba16aa3d
|
[Fix]: add missing device attribute to ChunkCache (#11493)
|
2025-10-12 20:49:59 -07:00 |
|
Qiaolin Yu
|
a2b3d9b90b
|
Update DeepSeek-R1-FP4 default config on blackwell (#11512)
|
2025-10-12 20:32:11 -07:00 |
|
Qi Yuhang
|
9a30914e94
|
[sgl-kernel][1/N]Support Expert Specialization Grouped GEMM (#11432)
Co-authored-by: luoyuan.luo <luoyuan.luo@antgroup.com>
Co-authored-by: PGFLMG <1106310035@qq.com>
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
|
2025-10-12 20:19:21 -07:00 |
|
Jonah Bernard
|
8e776c78a1
|
docs(router): add token-bucket rate limiting to the docs (#11485)
|
2025-10-12 20:03:27 -07:00 |
|
Keyang Ru
|
63e84352b7
|
[router] openai router: support grok model (#11511)
|
2025-10-12 22:44:43 -04:00 |
|
Yongtong Wu
|
a20e7df8d0
|
Improve dp attention port assignment scheme (#5889)
Co-authored-by: Cheng Wan <cwan@x.ai>
|
2025-10-12 17:55:59 -07:00 |
|
Cheng Wan
|
1bdd010291
|
Revert "Deprecate global_server_args_dict" (#11520)
|
2025-10-12 17:40:40 -07:00 |
|
Cheng Wan
|
6cd296940a
|
[lint] Fix the lint issue (#11516)
|
2025-10-12 16:22:46 -07:00 |
|
Lianmin Zheng
|
2ac46e94ef
|
Sync changes on io_struct.py and deterministic ops (#11498)
|
2025-10-12 16:03:10 -07:00 |
|
Binyao Jiang
|
0aa65f94f1
|
[Fix] Improve longbench prompt and other logics (#11474)
|
2025-10-12 15:04:28 -07:00 |
|
Yineng Zhang
|
0ecb42613d
|
fix: revert temporarily remove b200 tests (#11515)
|
2025-10-12 15:02:37 -07:00 |
|
Yineng Zhang
|
05f015f65f
|
chore: remove flashinfer cleanup cache (#11514)
|
2025-10-12 14:56:33 -07:00 |
|
Liangsheng Yin
|
1083e7e3df
|
Deprecate global_server_args_dict (#11331)
|
2025-10-13 01:20:47 +08:00 |
|
Liangsheng Yin
|
2157d12ae8
|
[CI] fix lint (#11509)
|
2025-10-13 01:07:21 +08:00 |
|
Mick
|
9f2b457cbe
|
doc: add doc for adding new models into nightly-ci (#11443)
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
|
2025-10-12 08:35:10 -07:00 |
|
hzh0425
|
f5b34a510c
|
Bugfix: Fix Type consistency for KV indices in SWARadixCache (#11452)
Co-authored-by: yizhang2077 <1109276519@qq.com>
|
2025-10-12 23:19:44 +08:00 |
|
Lianmin Zheng
|
5a6ec8f999
|
Fix unit tests (#11503)
|
2025-10-12 07:45:57 -07:00 |
|
Lianmin Zheng
|
6a653bb11b
|
temporarily remove b200 tests (#11502)
|
2025-10-12 06:48:49 -07:00 |
|
Lianmin Zheng
|
548a57b1f3
|
Fix port conflicts in CI (#11497)
|
2025-10-12 06:46:36 -07:00 |
|
Lianmin Zheng
|
88e73ed048
|
Temporarily remove b200 tests (#11501)
|
2025-10-12 06:41:37 -07:00 |
|
Yi Zhang
|
4b15fa00f0
|
move fla env check position (#11500)
|
2025-10-12 06:40:45 -07:00 |
|
Liangsheng Yin
|
f49419061d
|
Move args from global_config to environ (#11332)
|
2025-10-12 21:29:31 +08:00 |
|
Liangsheng Yin
|
01e59e8247
|
Fix CI break by express-laned PRs. (#11499)
|
2025-10-12 21:06:06 +08:00 |
|
Mike Qiu
|
99a0704a36
|
bailingMoE: Fix Key error of deepep_mode (#11465)
Signed-off-by: Michael Qiu <qiudayu.qdy@antgroup.com>
Co-authored-by: Mike_Qiu <qiudayu.qdy@antgroup.com>
|
2025-10-12 20:42:59 +08:00 |
|
Antoine Roux
|
ec1cd90ac9
|
Fix the GPT function calling regex to allow dash in the name (#10577)
|
2025-10-12 20:34:58 +08:00 |
|
Kai-Hsun Chen
|
1103dc6204
|
[chore][2/N] Avoid using default mutable parameters (#11479)
Signed-off-by: Kai-Hsun Chen <khchen@x.ai>
|
2025-10-12 20:34:04 +08:00 |
|
Vincent Zhong
|
a220536f40
|
[ perf ] Replace json-> orjson in hot path (#11221)
Signed-off-by: vincentzed <207368749+vincentzed@users.noreply.github.com>
|
2025-10-12 20:30:58 +08:00 |
|
Mahmoud Ashraf
|
7b064f04f8
|
[bugfix]: use correct causality condition for flashattention, flashinfer, and triton backends (#10172)
|
2025-10-12 20:28:16 +08:00 |
|
Kai-Hsun Chen
|
43190becfa
|
[chore][1/N] Avoid using default mutable parameters (#11478)
Signed-off-by: Kai-Hsun Chen <khchen@x.ai>
|
2025-10-12 20:26:39 +08:00 |
|
Vincent Zhong
|
be740acdb0
|
[smol] [perf] Qwen3-VL in place op. (#11481)
Signed-off-by: vincentzed <207368749+vincentzed@users.noreply.github.com>
|
2025-10-12 20:25:30 +08:00 |
|
sglang-bot
|
2db2cddd12
|
chore: bump sgl-kernel version to 0.3.16 (#11476)
|
2025-10-11 22:04:49 -07:00 |
|
Wenyi Xu
|
9b5efe3464
|
[Router]: Small Typo in a comment within tree.rs (#11489)
|
2025-10-11 21:59:48 -07:00 |
|
Yuwei An
|
4ac8e09df0
|
Piecewise CUDA Graph Support & Torch Compile Backend (#10062)
Signed-off-by: Oasis-Git <ayw.sirius19@gmail.com>
|
2025-10-12 11:55:57 +08:00 |
|
Liangsheng Yin
|
20a6c0a63d
|
Beta spec-overlap for EAGLE (#11398)
Co-authored-by: Lianmin Zheng <15100009+merrymercy@users.noreply.github.com>
Co-authored-by: Hanming Lu <69857889+hanming-lu@users.noreply.github.com>
|
2025-10-12 11:02:22 +08:00 |
|
Glen Liu
|
47c606d3dc
|
[Feature] support regex strings as a stopping condition (#10635)
|
2025-10-12 10:53:15 +08:00 |
|
Sahithi Chigurupati
|
9fcf73069f
|
[CI] Add nightly builds to dockerhub (#9804)
Signed-off-by: Sahithi Chigurupati <chigurupati.sahithi@gmail.com>
|
2025-10-11 18:27:46 -07:00 |
|
Zaili Wang
|
0a304870e8
|
fix Xeon CI (#11454)
|
2025-10-11 14:08:28 -07:00 |
|
PGFLMG
|
8fdcd98efe
|
[7/n] decouple quantization impl from vllm dependency - gguf kernel (#11019)
|
2025-10-11 14:04:57 -07:00 |
|
Lorenzo Lu
|
b5dcfd4154
|
Add option to disable any_whitespace for xgrammar and llguidance backends. (#8919)
Co-authored-by: Chang Su <chang.s.su@oracle.com>
|
2025-10-11 22:24:58 +08:00 |
|
ybyang
|
5061b8fd3e
|
fix stop when stream (#11462)
Signed-off-by: ybyang <ybyang7@iflytek.com>
Co-authored-by: Liangsheng Yin <lsyincs@gmail.com>
Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>
|
2025-10-11 22:06:31 +08:00 |
|