16 Commits

Author          SHA1        Message                                                         Date
Ying Sheng      4367f4bb8d  Fix prefill size (#711)                                         2024-07-24 03:41:15 -07:00
Liangsheng Yin  69d19188fc  Decouple kv (#679)                                              2024-07-20 14:16:45 -07:00
zhyncs          d93388da3e  feat: add check_env (#645)                                      2024-07-17 21:39:28 -07:00
Ying Sheng      476584cb6e  Increase the capacity of the memory pool (#643)                 2024-07-17 15:44:41 -07:00
zhyncs          2e341cd493  misc: add pre-commit config (#637)                              2024-07-17 11:55:39 -07:00
Lianmin Zheng   41d1f67704  Fix flush cache (#627)                                          2024-07-15 20:44:04 -07:00
Mingyi          5ac8b80677  Simplify mem state (#623)                                       2024-07-15 02:01:09 -07:00
Liangsheng Yin  a56858ba67  Unify index operations (#620)                                   2024-07-14 12:55:55 -07:00
Liangsheng Yin  564a898ad9  Optimize mem indices mangement (#619)                           2024-07-13 23:39:37 -07:00
Ying Sheng      5949b1ca0e  Fix memory pool index error (#616)                              2024-07-13 16:45:11 -07:00
Lianmin Zheng   0feca02dd9  Improve benchmark scripts (#615)                                2024-07-13 15:59:04 -07:00
Liangsheng Yin  10143e1a5f  Memorypool chunked prefetch (#614)                              2024-07-13 15:24:03 -07:00
Lianmin Zheng   665815969a  Enable cuda graph by default (#612)                             2024-07-13 05:29:46 -07:00
Liangsheng Yin  19818b9c2f  Minor: style improvement of radix_cache and memory_pool (#395)  2024-04-26 01:01:36 +08:00
Lianmin Zheng   c51020cf0c  Fix the chat template for llava-v1.6-34b & format code (#177)   2024-02-11 05:50:13 -08:00
Lianmin Zheng   22085081bb  release initial code                                            2024-01-08 04:37:50 +00:00
                            Co-authored-by: Ying Sheng <sqy1415@gmail.com>
                            Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>
                            Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>
                            Co-authored-by: parasol-aser <3848358+parasol-aser@users.noreply.github.com>
                            Co-authored-by: LiviaSun <33578456+ChuyueSun@users.noreply.github.com>
                            Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>