25 Commits

Author SHA1 Message Date
Serge Panev
2b1da821b5 [NVIDIA] Add new SMs support for Spark & Thor (#11287)
Signed-off-by: Serge Panev <spanev@nvidia.com>
2025-10-22 02:02:24 +08:00
b8zhong
d0a64c7e2c vlm: enforce pybase64 for image and str encode/decode (#10700) 2025-10-21 19:05:32 +08:00
Zhengke Zhou
260fe755b6 Simplify multi-tokenizer (#11295)
Signed-off-by: zhengkezhou1 <madzhou1@gmail.com>
Co-authored-by: Liangsheng Yin <lsyincs@gmail.com>
2025-10-21 16:33:29 +08:00
Meng, Hengyu
b113c72e7a Init attention backend for Intel XPU (#10656)
Co-authored-by: guangyey <guangye.yu@intel.com>
Co-authored-by: DiweiSun <105627594+DiweiSun@users.noreply.github.com>
2025-10-21 11:41:28 +08:00
Shane A
d383e6616e [Model] Add Olmo 3 model support (#11396) 2025-10-19 23:59:16 -07:00
Liangsheng Yin
48738af7f9 [CI] always print back trace in retry() (#11834) 2025-10-20 01:12:49 +08:00
Liangsheng Yin
57e25de756 Revert "Fix: Dynamic RoPE Cache Expansion to Prevent Position-ID Out-of-Bounds in EAGLE + Long-Sequence Workloads" (#11827) 2025-10-19 19:44:06 +08:00
YAMY
80407b0493 Fix: Dynamic RoPE Cache Expansion to Prevent Position-ID Out-of-Bounds in EAGLE + Long-Sequence Workloads (#10788) 2025-10-19 11:37:43 +08:00
Minglei Zhu
f4488e9dd9 set default attention backend for deterministic inference (#11801) 2025-10-18 00:01:24 -07:00
Lianmin Zheng
9eefe2c0b7 Set CUDA_VISIBLE_DEVICES to achieve one GPU per process (#9170)
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
Co-authored-by: Cheng Wan <cwan@x.ai>
Co-authored-by: Cheng Wan <54331508+ch-wan@users.noreply.github.com>
2025-10-17 17:30:06 -07:00
Chang Su
627974405d [Lint] Add python/sglang to ruff F401 checks and remove unused imports in files (#11685) 2025-10-17 16:49:46 -07:00
Lianmin Zheng
fdd7c69d65 [Auto Sync] Update common.py (20251017) (#11782)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Cheng Wan <54331508+ch-wan@users.noreply.github.com>
2025-10-17 15:03:42 -07:00
Chunyuan WU
8fcc69e7c4 Turn on shm_allreduce and shm_allgather for fp16 (#10725) 2025-10-17 12:35:20 -07:00
Baizhou Zhang
b0d1d717e1 Revert "make radix cache deterministic" (#11728) 2025-10-16 14:36:15 -07:00
Alex Chi Z
dc965db0e0 make radix cache deterministic (#10721)
Signed-off-by: Alex Chi Z <iskyzh@gmail.com>
2025-10-14 21:01:52 +08:00
Yongtong Wu
a20e7df8d0 Improve dp attention port assignment scheme (#5889)
Co-authored-by: Cheng Wan <cwan@x.ai>
2025-10-12 17:55:59 -07:00
Vincent Zhong
a220536f40 [ perf ] Replace json-> orjson in hot path (#11221)
Signed-off-by: vincentzed <207368749+vincentzed@users.noreply.github.com>
2025-10-12 20:30:58 +08:00
Kai-Hsun Chen
43190becfa [chore][1/N] Avoid using default mutable parameters (#11478)
Signed-off-by: Kai-Hsun Chen <khchen@x.ai>
2025-10-12 20:26:39 +08:00
Liu-congo
c80a96dae9 [BugFix] test_mla_fp8.py fails on Cublas 12.9 (#11360)
Signed-off-by: Liu-congo <1502632128@qq.com>
2025-10-10 21:14:24 -07:00
Lianmin Zheng
61055cb309 Reorder PD disagg CI tests (#11438) 2025-10-10 17:56:49 -07:00
Yingchun Lai
0fe87213bb fix: fix gpu-proc affinity set incorrectly when pp_size > 1 (#11389) 2025-10-09 18:40:05 -07:00
Lianmin Zheng
9b8ebb2798 move more files under srt/utils (#11285) 2025-10-09 16:46:15 -07:00
Netanel Haber
d6837aea4d model: Support Hybrid Mamba2 NemotronHForCausalLM (nvidia/NVIDIA-Nemotron-Nano-9B-v2) (#10909)
Signed-off-by: Netanel Haber <nhaber@nvidia.com>
2025-10-09 00:37:38 +08:00
fzyzcjy
efbc687c28 Support DeepSeek V3.2 Exp (#11061)
Co-authored-by: Stefan He <11166516+hebiao064@users.noreply.github.com>
Co-authored-by: Liangsheng Yin <95566987+hnyls2002@users.noreply.github.com>
Co-authored-by: Baizhou Zhang <56809903+fridge003@users.noreply.github.com>
Co-authored-by: DarkSharpness <76582120+darksharpness@users.noreply.github.com>
Co-authored-by: ZhengdQin <46387172+zhengdqin@users.noreply.github.com>
Co-authored-by: DarkSharpness <2040703891@qq.com>
Co-authored-by: hnyls2002 <lsyincs@gmail.com>
Co-authored-by: Zhengda Qin <zhengdqin@gmail.com>
Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>
Co-authored-by: HAI <hixiao@gmail.com>
Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com>
2025-10-06 00:24:15 -07:00
fzyzcjy
fdc4e1e570 Tiny move files to utils folder (#11166) 2025-10-03 22:40:06 +08:00