| Author | Commit | Message | Date |
|---|---|---|---|
| Lianmin Zheng | 22352d47a9 | Improve streaming, log_level, memory report, weight loading, and benchmark script (#7632); Co-authored-by: Kan Wu <wukanustc@gmail.com> | 2025-06-29 23:16:19 -07:00 |
| Lianmin Zheng | ac2387279e | Support penalty in overlap mode; return logprob with chunked prefill; improve benchmark scripts (#3988); Co-authored-by: SangBin Cho <rkooo567@gmail.com>; Co-authored-by: dhou-xai <dhou@x.ai>; Co-authored-by: Hanming Lu <hanming_lu@berkeley.edu> | 2025-03-03 00:12:04 -08:00 |
| Lianmin Zheng | 93b77c8e8a | Fix the request loggings to make it fully able to be easily replayed (#2973) | 2025-01-18 21:45:00 -08:00 |
| Lianmin Zheng | 46d4431889 | Add a new api configure_logging to allow dumping the requests (#2875) | 2025-01-13 14:24:00 -08:00 |