EngineX/xc-llm-kunlun
0ce5f1a3f7294ea42ab5649a8408f9cae9e5e5a5
xc-llm-kunlun/vllm_kunlun/ops/attention
fromck
0ce5f1a3f7
Add kernels to optimize RoPE and the decoding stage (#143)
Co-authored-by: chengxiaokang <chengxiaokang@baidu.com>
2026-01-23 10:29:52 +08:00
backends
Commit the vLLM 0.11.0 development branch
2025-12-10 17:51:24 +08:00
__init__.py
Initial commit for vLLM-Kunlun Plugin
2025-12-10 12:05:39 +08:00
flashmla.py
Add kernels to optimize RoPE and the decoding stage (#143)
2026-01-23 10:29:52 +08:00
layer.py
Commit the vLLM 0.11.0 development branch
2025-12-10 17:51:24 +08:00
merge_attn_states.py
longcontext chunk make attention crash, fix it (#117)
2026-01-17 18:38:23 +08:00
mla.py
[Feature] support deepseek v3/r1/v3.2 (#78)
2026-01-05 22:55:35 +08:00