4 Commits

Author SHA1 Message Date
Liangsheng Yin
70b6802982 Optimize conflicts between CUDA graph and vocab mask tensors (#1392) 2024-09-13 20:27:53 -07:00
Lianmin Zheng
12cb115d38 Fix llama2 weight loader (#1317) 2024-09-03 05:32:14 -07:00
Lianmin Zheng
f64eae3a29 [Fix] Reduce memory usage for loading llava model & Remove EntryClassRemapping (#1308) 2024-09-02 21:44:45 -07:00
김종곤
b7f8341014 EXAONE 3.0 Model Support (#1258) 2024-08-30 08:08:28 +00:00
Co-authored-by: Yineng Zhang <me@zhyncs.com>