7 Commits

Author SHA1 Message Date
bjmsong e21026690d benchmark decoding attention kernel with cudnn (#2467) 2024-12-17 03:31:57 -08:00
    Co-authored-by: root <bjmsong@126.com>
Xiaoyu Zhang a0592c059f [Benchmark] add a benchmark for hf/vllm/sglang rmsnorm (#2486) 2024-12-15 13:52:08 +08:00
bjmsong f67723940d decoding attention kernel benchmark (#2425) 2024-12-11 04:46:59 -08:00
    Co-authored-by: root <bjmsong@126.com>
Xiaoyu Zhang 3844feb9bb Add a unittest for fused_moe (#2416) 2024-12-08 22:46:10 -08:00
Lianmin Zheng 07ec07ad1f Improve torch compile for fused moe (#2327) 2024-12-03 01:58:25 -08:00
Lianmin Zheng 33deca81b5 Add more fused moe benchmark utilities (#2314) 2024-12-02 04:26:55 -08:00
Xiaoyu Zhang 262e370f78 [benchmark] Add fused_moe_triton benchmark and tuning tools (#2225) 2024-11-29 13:36:45 -08:00
    Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
    Co-authored-by: HAI <hixiao@gmail.com>