4 Commits

Author SHA1 Message Date
Lianmin Zheng
22352d47a9 Improve streaming, log_level, memory report, weight loading, and benchmark script (#7632)
Co-authored-by: Kan Wu <wukanustc@gmail.com>
2025-06-29 23:16:19 -07:00
Lianmin Zheng
ac2387279e Support penalty in overlap mode; return logprob with chunked prefill; improve benchmark scripts (#3988)
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
Co-authored-by: dhou-xai <dhou@x.ai>
Co-authored-by: Hanming Lu <hanming_lu@berkeley.edu>
2025-03-03 00:12:04 -08:00
Lianmin Zheng
93b77c8e8a Fix the request loggings to make it fully able to be easily replayed (#2973)
2025-01-18 21:45:00 -08:00
Lianmin Zheng
46d4431889 Add a new api configure_logging to allow dumping the requests (#2875)
2025-01-13 14:24:00 -08:00