Ying Sheng | 4367f4bb8d | Fix prefill size (#711) | 2024-07-24 03:41:15 -07:00
Liangsheng Yin | 69d19188fc | Decouple kv (#679) | 2024-07-20 14:16:45 -07:00
zhyncs | d93388da3e | feat: add check_env (#645) | 2024-07-17 21:39:28 -07:00
Ying Sheng | 476584cb6e | Increase the capacity of the memory pool (#643) | 2024-07-17 15:44:41 -07:00
zhyncs | 2e341cd493 | misc: add pre-commit config (#637) | 2024-07-17 11:55:39 -07:00
Lianmin Zheng | 41d1f67704 | Fix flush cache (#627) | 2024-07-15 20:44:04 -07:00
Mingyi | 5ac8b80677 | Simplify mem state (#623) | 2024-07-15 02:01:09 -07:00
Liangsheng Yin | a56858ba67 | Unify index operations (#620) | 2024-07-14 12:55:55 -07:00
Liangsheng Yin | 564a898ad9 | Optimize mem indices mangement (#619) | 2024-07-13 23:39:37 -07:00
Ying Sheng | 5949b1ca0e | Fix memory pool index error (#616) | 2024-07-13 16:45:11 -07:00
Lianmin Zheng | 0feca02dd9 | Improve benchmark scripts (#615) | 2024-07-13 15:59:04 -07:00
Liangsheng Yin | 10143e1a5f | Memorypool chunked prefetch (#614) | 2024-07-13 15:24:03 -07:00
Lianmin Zheng | 665815969a | Enable cuda graph by default (#612) | 2024-07-13 05:29:46 -07:00
Liangsheng Yin | 19818b9c2f | Minor: style improvement of radix_cache and memory_pool (#395) | 2024-04-26 01:01:36 +08:00
Lianmin Zheng | c51020cf0c | Fix the chat template for llava-v1.6-34b & format code (#177) | 2024-02-11 05:50:13 -08:00
Lianmin Zheng | 22085081bb | release initial code; Co-authored-by: Ying Sheng <sqy1415@gmail.com>; Co-authored-by: Liangsheng Yin <hnyls2002@gmail.com>; Co-authored-by: Zhiqiang Xie <xiezhq@stanford.edu>; Co-authored-by: parasol-aser <3848358+parasol-aser@users.noreply.github.com>; Co-authored-by: LiviaSun <33578456+ChuyueSun@users.noreply.github.com>; Co-authored-by: Cody Yu <hao.yu.cody@gmail.com> | 2024-01-08 04:37:50 +00:00