14 Commits

Author SHA1 Message Date
Stefan He
5db37c8626 [metrics] Add in queue metrics (#4444) 2025-04-09 17:19:27 -07:00
Lianmin Zheng
4ede6770cd Fix retract for page size > 1 (#4914) 2025-03-30 02:57:15 -07:00
Lianmin Zheng
fcc2e37f69 Split the __init__ of scheduler as smaller functions. Improve the eagle tests (#4128) 2025-03-06 00:13:20 -08:00
Lianmin Zheng
66301e124f Improve code styles (#4021) 2025-03-03 03:20:23 -08:00
Lianmin Zheng
ac2387279e Support penalty in overlap mode; return logprob with chunked prefill; improve benchmark scripts (#3988) 2025-03-03 00:12:04 -08:00
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
Co-authored-by: dhou-xai <dhou@x.ai>
Co-authored-by: Hanming Lu <hanming_lu@berkeley.edu>
Lianmin Zheng
cd493b5afc Improve metrics, logging, and importing orders (#2992) 2025-01-19 18:36:59 -08:00
Lianmin Zheng
51ab3ccf47 Collect more metrics: num_requests_total (#2859) 2025-01-13 03:57:39 -08:00
Lianmin Zheng
bdc1acf6cd Misc fix for min_p_sampling, --cuda-graph-bs (#2761) 2025-01-07 02:52:53 -08:00
Xuehai Pan
62a4a339eb docs: fix module docstrings and copyright headers (#2077) 2024-11-22 22:16:53 +08:00
Lianmin Zheng
1929c06762 Simplify prometheus metrics (#1981) 2024-11-10 04:39:32 -08:00
Co-authored-by: Mohit Reddy <mohitreddy1996@users.noreply.github.com>
Yudi Xue
95a4ed129a Fix metrics (#1963) 2024-11-08 23:21:11 -08:00
Chayenne
c77c1e05ba fix black in pre-commit (#1940) 2024-11-08 07:42:47 +08:00
Xuehai Pan
a5e0defb5a minor: Add basic editorconfig and pre-commit hooks to enforce style for whitespaces (#1926) 2024-11-06 13:46:04 +00:00
Lzhang-hub
a146d9990e support prometheus metrics (#1853) 2024-11-05 20:42:53 -08:00
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>