8 Commits

Author SHA1 Message Date
Yineng Zhang
fad315cb8e fix EAGLE 2 non greedy case (#3407)
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
2025-02-09 07:28:34 +08:00
Yineng Zhang
d39899e85c upgrade flashinfer v0.2.0.post2 (#3288)
Co-authored-by: pankajroark <pankajroark@users.noreply.github.com>
2025-02-04 21:41:40 +08:00
Yineng Zhang
013021b6a1 refactor EAGLE 2 (#3269)
Co-authored-by: Ying Sheng <sqy1415@gmail.com>
Co-authored-by: merrymercy <lianminzheng@gmail.com>
Co-authored-by: Ying1123 <sqy1415@gmail.com>
2025-02-03 20:52:30 +08:00
Lianmin Zheng
287d07a669 Misc fixes for eagle (flush_cache, CPU overhead) (#3014)
2025-01-20 20:27:38 -08:00
justdoit
4093aa4660 [Fix]eagle2 health_generate is first request,apiserver will core (#2853)
2025-01-13 01:01:21 -08:00
justdoit
a47bf39123 [Eagle2] Fix multiple concurrent request crashes (#2730)
2025-01-10 14:00:43 -08:00
Lianmin Zheng
b8574f6953 Clean up eagle code (#2756)
2025-01-06 14:54:18 -08:00
yukavio
815dce0554 Eagle speculative decoding part 4: Add EAGLE2 worker (#2150)
Co-authored-by: kavioyu <kavioyu@tencent.com>
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
2025-01-02 03:22:34 -08:00