Faraz | 4b04998d38 | 2025-07-31 16:03:40 -07:00
    TRTLLM Gen MLA Decode Kernel Integration (same as #7938) (#8632)
    Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>

Lianmin Zheng | 9c7a46180c | 2025-07-18 16:38:26 -07:00
    [Doc] Steps to add a new attention backend (#8155)

ronnie_zheng | 1e0e549766 | 2025-07-03 09:23:19 -07:00
    Ascend attention backend (PA & MLA) (#7722)
    Co-authored-by: Maksim <makcum888e@mail.ru>
    Co-authored-by: VDV1985 <vladdv85@mail.ru>

Lianmin Zheng | 21615cc3fe | 2025-06-16 01:03:13 -07:00
    Minor style and doc fix (#7228)

quinnrong94 | 2e4babdb0a | 2025-05-15 00:48:09 -07:00
    [Feat] Support FlashMLA backend with MTP and FP8 KV cache (#6109)
    Co-authored-by: Yingyi <yingyihuang2000@outlook.com>
    Co-authored-by: neiltian <neiltian@tencent.com>
    Co-authored-by: lukec <118525388+sleepcoo@users.noreply.github.com>
    Co-authored-by: kexueyu <kexueyu@tencent.com>
    Co-authored-by: vincentmeng <vincentmeng@tencent.com>
    Co-authored-by: pengmeng <pengmeng@tencent.com>

Didier Durand | 92d1561b70 | 2025-04-17 01:42:40 -07:00
    Update attention_backend.md: plural form (#5489)

mRSun15 | 3efc8e2d2a | 2025-04-15 17:16:34 -07:00
    add attention backend supporting matrix in the doc (#5211)
    Co-authored-by: Stefan He <hebiaobuaa@gmail.com>