7 Commits

Author         SHA1        Message                                     Date
Lianmin Zheng  621e96bf9b  [CI] Fix ci tests (#5769)                   2025-04-27 07:18:10 -07:00
fzyzcjy        cd7e32e2cb  Optimize attention in llama4 (#5127)        2025-04-11 00:32:41 -07:00
Mick           fbebcb7aa4  model: support mllama4 (#5144)              2025-04-09 09:28:44 -07:00
HandH1998      4065248214  Support Llama4 fp8 inference (#5194)        2025-04-09 20:14:34 +08:00
                           Co-authored-by: laixinn <xielx@shanghaitech.edu.cn>
                           Co-authored-by: sleepcoo <sleepcoo@gmail.com>
                           Co-authored-by: zhyncs <me@zhyncs.com>
fzyzcjy        86a876d883  Optimize topk operation in llama4 (#5128)   2025-04-09 02:50:22 -07:00
fzyzcjy        5039d54772  Support 2x8xH100 for Llama 4 (#5159)        2025-04-08 14:55:14 -07:00
Chang Su       f04c80dc42  Add Llama4 support (#5092)                  2025-04-07 00:29:36 -07:00
                           Co-authored-by: Cheng Wan <cwan39@gatech.edu>
                           Co-authored-by: fzyzcjy <ch271828n@outlook.com>
                           Co-authored-by: ispobock <ispobaoke@163.com>