Shangming Cai | 2ff572e28c | [CI][Router] Fix bench_one_batch_server for pd router test (#7731); Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com> | 2025-07-02 23:18:24 -07:00 |
Lianmin Zheng | 22352d47a9 | Improve streaming, log_level, memory report, weight loading, and benchmark script (#7632); Co-authored-by: Kan Wu <wukanustc@gmail.com> | 2025-06-29 23:16:19 -07:00 |
Lianmin Zheng | 2d72fc47cf | Improve profiler and integrate profiler in bench_one_batch_server (#6787) | 2025-05-31 15:53:55 -07:00 |
fzyzcjy | 55f6005f53 | Fix bench_one_batch_server (#6503) | 2025-05-21 11:08:17 -07:00 |
fzyzcjy | 7222e1dacc | Let bench_one_batch_server use sharegpt data to make expert distribution more natural (#5573) | 2025-05-21 02:08:43 -07:00 |
Lifu Huang | 3cf1473a09 | Use monotonic clock for interval measurement (#6211); Signed-off-by: Lifu Huang <lifu.hlf@gmail.com> | 2025-05-17 16:49:18 -07:00 |
Lianmin Zheng | fba8eccd7e | Log if cuda graph is used & extend cuda graph capture to cuda-graph-max-bs (#6201); Co-authored-by: SangBin Cho <rkooo567@gmail.com> | 2025-05-12 00:17:33 -07:00 |
Lianmin Zheng | 005aad32ad | Revert "[fix] fix bench_one_batch_server" (#5785) | 2025-04-27 03:48:33 -07:00 |
JieXin Liang | 3c4dc38a9a | [fix] fix bench_one_batch_server (#5607) | 2025-04-26 18:49:45 -07:00 |
fzyzcjy | f01b092519 | Super tiny fix typo (#4738) | 2025-03-24 21:05:45 -07:00 |
Lianmin Zheng | 03464890e0 | Separate two entry points: Engine and HTTP server (#2996); Co-authored-by: fzyzcjy <5236035+fzyzcjy@users.noreply.github.com> | 2025-01-19 22:09:24 -08:00 |
Lianmin Zheng | d4fc1a70e3 | Crash the server correctly during error (#2231) | 2024-11-28 00:22:39 -08:00 |
Lianmin Zheng | fb6e04a0c2 | Use an env var SGLANG_SET_CPU_AFFINITY to set cpu affinity; turn it off by default (#2222) | 2024-11-27 02:52:46 -08:00 |
Lianmin Zheng | 6997e28f6e | Revert "Use an env var SGLANG_SET_CPU_AFFINITY to set cpu affinity; turn it off by default" (#2221) | 2024-11-27 02:02:01 -08:00 |
Lianmin Zheng | a0e58740a8 | Use an env var SGLANG_SET_CPU_AFFINITY to set cpu affinity; turn it off by default (#2217) | 2024-11-27 01:13:41 -08:00 |
Lianmin Zheng | dfec7fca06 | Rename sglang.bench_latency to sglang.bench_one_batch (#2118) | 2024-11-21 20:07:48 -08:00 |