19 Commits

Author SHA1 Message Date
Lianmin Zheng
2d72fc47cf Improve profiler and integrate profiler in bench_one_batch_server (#6787) 2025-05-31 15:53:55 -07:00
Lianmin Zheng
849c83a0c0 [CI] test chunked prefill more (#5798) 2025-04-28 10:57:17 -07:00
fzyzcjy
15ddd84322 Add retry for flaky tests in CI (#4755) 2025-03-25 16:53:12 -07:00
Ke Bao
45212ce18b Add deepseek v2 torch compile pr test (#4538) 2025-03-18 00:29:24 -07:00
Lianmin Zheng
48473684cc Split test_mla.py into two files (#4216) 2025-03-08 15:40:49 -08:00
Lianmin Zheng
d4017a6b63 [EAGLE] many fixes for eagle (#4195)
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
Co-authored-by: Sehoon Kim <sehoon@x.ai>
2025-03-07 22:12:13 -08:00
Ke Bao
03b0364f76 Update nextn ci test (#4071) 2025-03-04 13:01:24 -08:00
Lianmin Zheng
ac2387279e Support penalty in overlap mode; return logprob with chunked prefill; improve benchmark scripts (#3988)
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
Co-authored-by: dhou-xai <dhou@x.ai>
Co-authored-by: Hanming Lu <hanming_lu@berkeley.edu>
2025-03-03 00:12:04 -08:00
Yineng Zhang
e782eb7e6a chore: bump v0.4.3.post1 (#3638) 2025-02-17 21:58:19 +08:00
Yineng Zhang
e319153be8 update unit test (#3636) 2025-02-17 21:06:10 +08:00
Yineng Zhang
32b44d2fca add mtp unit test (#3634) 2025-02-17 19:04:07 +08:00
Yineng Zhang
2b1808cec4 update unit test in AMD CI (#3366) 2025-02-07 17:25:16 +08:00
Ke Bao
5317902670 Add test for fp8 torch compile (#3246) 2025-02-01 16:07:54 +08:00
Lianmin Zheng
67008f4b32 Use only one GPU for MLA CI tests (#2858) 2025-01-13 03:55:33 -08:00
Lianmin Zheng
d4fc1a70e3 Crash the server correctly during error (#2231) 2024-11-28 00:22:39 -08:00
Lianmin Zheng
4af3f889fc Simplify flashinfer indices update for prefill (#2074)
Co-authored-by: kavioyu <kavioyu@tencent.com>
Co-authored-by: kavioyu <kavioyu@gmail.com>
2024-11-18 00:02:36 -08:00
Lianmin Zheng
86fc0d79d0 Add a watch dog thread (#1816) 2024-10-27 02:00:50 -07:00
Ke Bao
b8ccaf4d73 Add MLA gsm8k eval (#1484) 2024-09-21 11:16:13 +08:00
Ke Bao
a68cb201dd Fix triton head num (#1482) 2024-09-21 10:25:20 +08:00