Fix memory leak for chunked prefill 2 (#1858)

Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>
Lianmin Zheng
2024-10-31 14:51:51 -07:00
committed by GitHub
parent 8ce202a493
commit a2e0424abf
7 changed files with 138 additions and 30 deletions

@@ -1,7 +1,6 @@
# Guide on Hyperparameter Tuning
## Achieving Peak Throughput
Achieving a large batch size is the most important thing for attaining high throughput.
When the server is running at full load, look for the following in the log: