Meng, Hengyu
|
b113c72e7a
|
Init attention backend for Intel XPU (#10656)
Co-authored-by: guangyey <guangye.yu@intel.com>
Co-authored-by: DiweiSun <105627594+DiweiSun@users.noreply.github.com>
|
2025-10-21 11:41:28 +08:00 |
|
Baizhou Zhang
|
44f0ece9fc
|
[Doc] Update documents for FA4 (#11778)
|
2025-10-19 17:40:38 -07:00 |
|
b8zhong
|
6bc503af73
|
[Doc] Update support matrix for attn and hybrid attn (#11293)
|
2025-10-14 22:43:11 -07:00 |
|
Philip Kiely - Baseten
|
7f028b07c4
|
Fix formatting in long code blocks (#10528)
|
2025-09-16 12:02:05 -07:00 |
|
jacky.cheng
|
25caa7a8a9
|
[AMD] Support Wave attention backend with AMD GPU optimizations (#8660)
Signed-off-by: Stanley Winata <stanley.winata@amd.com>
Signed-off-by: Harsh Menon <harsh@nod-labs.com>
Signed-off-by: nithinsubbiah <nithinsubbiah@gmail.com>
Signed-off-by: Ivan Butygin <ivan.butygin@gmail.com>
Signed-off-by: xintin <gaurav.verma@amd.com>
Co-authored-by: Harsh Menon <harsh@nod-labs.com>
Co-authored-by: Stanley Winata <stanley.winata@amd.com>
Co-authored-by: Stanley Winata <68087699+raikonenfnu@users.noreply.github.com>
Co-authored-by: Stanley Winata <stanley@nod-labs.com>
Co-authored-by: Ivan Butygin <ivan.butygin@gmail.com>
Co-authored-by: nithinsubbiah <nithinsubbiah@gmail.com>
Co-authored-by: Nithin Meganathan <18070964+nithinsubbiah@users.noreply.github.com>
Co-authored-by: Ivan Butygin <ibutygin@amd.com>
|
2025-08-12 13:49:11 -07:00 |
|
Faraz
|
f508cd3cb7
|
TRTLLM-MLA FP8 path (#8638)
Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>
|
2025-08-11 14:02:13 -07:00 |
|
Lianmin Zheng
|
2449a0afe2
|
Refactor the docs (#9031)
|
2025-08-10 19:49:45 -07:00 |
|