6 Commits

Author SHA1 Message Date
Lianmin Zheng
9463bc1385 Enable torch.compile for triton backend (#1422) 2024-09-14 15:38:37 -07:00
Jerry Zhang
a7c47e0f02 Add torchao quant (int4/int8/fp8) to llama models (#1341) 2024-09-09 05:32:41 -07:00
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
Lianmin Zheng
e4d68afcf0 [Minor] Many cleanup (#1357) 2024-09-09 04:14:11 -07:00
Lianmin Zheng
843e63d809 Fix the flaky test test_moe_eval_accuracy_large.py (#1326) 2024-09-04 04:15:11 -07:00
Lianmin Zheng
58fa607622 Fix the flaky tests in test_moe_eval_accuracy_large.py (#1293) 2024-09-01 12:20:46 -07:00
Lianmin Zheng
1b5d56f7f8 [CI] Add more multi-gpu tests (#1280) 2024-09-01 00:27:25 -07:00