Lianmin Zheng | 621e96bf9b | [CI] Fix ci tests (#5769) | 2025-04-27 07:18:10 -07:00
fzyzcjy | cd7e32e2cb | Optimize attention in llama4 (#5127) | 2025-04-11 00:32:41 -07:00
Mick | fbebcb7aa4 | model: support mllama4 (#5144) | 2025-04-09 09:28:44 -07:00
HandH1998 | 4065248214 | Support Llama4 fp8 inference (#5194); Co-authored-by: laixinn <xielx@shanghaitech.edu.cn>; Co-authored-by: sleepcoo <sleepcoo@gmail.com>; Co-authored-by: zhyncs <me@zhyncs.com> | 2025-04-09 20:14:34 +08:00
fzyzcjy | 86a876d883 | Optimize topk operation in llama4 (#5128) | 2025-04-09 02:50:22 -07:00
fzyzcjy | 5039d54772 | Support 2x8xH100 for Llama 4 (#5159) | 2025-04-08 14:55:14 -07:00
Chang Su | f04c80dc42 | Add Llama4 support (#5092); Co-authored-by: Cheng Wan <cwan39@gatech.edu>; Co-authored-by: fzyzcjy <ch271828n@outlook.com>; Co-authored-by: ispobock <ispobaoke@163.com> | 2025-04-07 00:29:36 -07:00