Yineng Zhang | fad315cb8e | fix EAGLE 2 non greedy case (#3407); Co-authored-by: Ying Sheng <sqy1415@gmail.com> | 2025-02-09 07:28:34 +08:00
Yineng Zhang | d39899e85c | upgrade flashinfer v0.2.0.post2 (#3288); Co-authored-by: pankajroark <pankajroark@users.noreply.github.com> | 2025-02-04 21:41:40 +08:00
Yineng Zhang | 013021b6a1 | refactor EAGLE 2 (#3269); Co-authored-by: Ying Sheng <sqy1415@gmail.com>; Co-authored-by: merrymercy <lianminzheng@gmail.com>; Co-authored-by: Ying1123 <sqy1415@gmail.com> | 2025-02-03 20:52:30 +08:00
Lianmin Zheng | 287d07a669 | Misc fixes for eagle (flush_cache, CPU overhead) (#3014) | 2025-01-20 20:27:38 -08:00
justdoit | 4093aa4660 | [Fix]eagle2 health_generate is first request,apiserver will core (#2853) | 2025-01-13 01:01:21 -08:00
justdoit | a47bf39123 | [Eagle2] Fix multiple concurrent request crashes (#2730) | 2025-01-10 14:00:43 -08:00
Lianmin Zheng | b8574f6953 | Clean up eagle code (#2756) | 2025-01-06 14:54:18 -08:00
yukavio | 815dce0554 | Eagle speculative decoding part 4: Add EAGLE2 worker (#2150); Co-authored-by: kavioyu <kavioyu@tencent.com>; Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com> | 2025-01-02 03:22:34 -08:00