Commit Graph

4 Commits

Author SHA1 Message Date
Qingquan Song
188f0955fa Add Speculative Decoding Eagle3 topk > 1 (#5318)
Co-authored-by: Stefan He <hebiaobuaa@gmail.com>
Co-authored-by: Yubo Wang <yubowang2019@gmail.com>
2025-04-20 22:58:28 -07:00
Baizhou Zhang
a42736bbb8 Support MHA with chunked prefix cache for DeepSeek chunked prefill (#5113) 2025-04-15 22:01:22 -07:00
Yubo Wang
fd5a55cfd3 Use public model for FA3 speculative decode testing (#5152) 2025-04-08 00:08:25 -07:00
Yubo Wang
804d9f2e4c Add unit test on page_size > 1 and mla and integration test for Flash Attention 3 (#4760) 2025-04-07 23:20:51 -07:00