Commit Graph

6 Commits

Author | SHA1 | Message | Date
Baizhou Zhang | 8ad700f735 | Cleaning codes for speculative attention mode (#10149) | 2025-09-08 17:38:06 -07:00
Liangsheng Yin | 2c2b19b18b | [CI] fix ambiguous argument in testing hybrid attentions. (#10161) | 2025-09-08 18:16:52 +08:00
cicirori | 8c5930f08a | Add speculator attention backend switch (#9981) | 2025-09-07 21:44:36 -07:00
DevashishLal-CB | 13705dae06 | [Fix] Add speculative_draft_model_revision to server_args (#5255); Signed-off-by: Devashish Lal <devashish@rivosinc.com> | 2025-09-05 19:45:46 +08:00
Qiaolin Yu | 4a4772ae03 | Support speculative decoding in hybrid attention backend (#9573) | 2025-08-28 01:11:42 -07:00
Qiaolin Yu | 2810338401 | [feat] Support different attention backends for prefill and decode (#6338); Co-authored-by: tianqilin.99 <tianqilin.99@bytedance.com>; Co-authored-by: Baizhou Zhang <sobereddiezhang@gmail.com> | 2025-07-28 11:42:29 +08:00