Commit Graph

4 Commits

Author SHA1 Message Date
Lianmin Zheng
1acccb364a Fix oom issues with fp8 for llama (#1454) 2024-09-18 03:45:19 -07:00
Lianmin Zheng
9463bc1385 Enable torch.compile for triton backend (#1422) 2024-09-14 15:38:37 -07:00
Lianmin Zheng
ad0ff62a4c Balance test in CI (#1411) 2024-09-12 23:29:44 -07:00
Lianmin Zheng
68be2f6d3b [CI] Include triton backend and online serving benchmark into CI (#1408) 2024-09-12 21:36:41 -07:00