7 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Lianmin Zheng | 048685430d | Improve process creation (#1534) | 2024-09-29 02:36:12 -07:00 |
| Lianmin Zheng | 9463bc1385 | Enable torch.compile for triton backend (#1422) | 2024-09-14 15:38:37 -07:00 |
| Lianmin Zheng | 46094e0c1b | Deprecate --disable-flashinfer and introduce --attention-backend (#1380) | 2024-09-10 17:11:16 -07:00 |
| Jerry Zhang | a7c47e0f02 | Add torchao quant (int4/int8/fp8) to llama models (#1341). Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com> | 2024-09-09 05:32:41 -07:00 |
| Mingyi | 158e8f1e2d | improve the threshold and ports in tests (#1215) | 2024-08-25 19:02:08 -07:00 |
| Yineng Zhang | f7fb68d292 | ci: add moe test (#1053) | 2024-08-13 18:43:23 +10:00 |
| Lianmin Zheng | c877292cc1 | Re-organize CI tests (#1052) | 2024-08-12 03:39:01 -07:00 |