EngineX/xc-llm-kunlun
f2019b145f0ca50153640486b4d9ed8cb9b7a87b
xc-llm-kunlun/vllm_kunlun/ops/attention
baoqian426 2512259944 longcontext chunk make attention crash, fix it (#117)
Co-authored-by: root <root@rdtest-node1150.bcc-zwlt.baidu.com>
2026-01-17 18:38:23 +08:00
..
backends
Commit the vllm0.11.0 development branch
2025-12-10 17:51:24 +08:00
__init__.py
Initial commit for vLLM-Kunlun Plugin
2025-12-10 12:05:39 +08:00
flashmla.py
[Misc]Specify that DS32 only supports --kv-cache-dtype bfloat16 (#119)
2026-01-17 16:52:02 +08:00
layer.py
Commit the vllm0.11.0 development branch
2025-12-10 17:51:24 +08:00
merge_attn_states.py
longcontext chunk make attention crash, fix it (#117)
2026-01-17 18:38:23 +08:00
mla.py
[Feature] support deepseek v3/r1/v3.2 (#78)
2026-01-05 22:55:35 +08:00