xc-llm-kunlun/mla at 2a998286c05ed12d0d82a5c8fea1ec136a8efe01

EngineX/xc-llm-kunlun

fromck 74d4f804e8 add 2 kernels and optimize the calculation of topk_indices (#134)

Co-authored-by: chengxiaokang <chengxiaokang@baidu.com>

2026-01-22 10:29:28 +08:00

__init__.py

[Feature] support deepseek v3/r1/v3.2 (#78)

2026-01-05 22:55:35 +08:00

common.py

longcontext chunk make attention crash, fix it (#117)

2026-01-17 18:38:23 +08:00

flashmla_sparse.py

[Misc] Specify that DS32 only supports --kv-cache-dtype bfloat16 (#119)

2026-01-17 16:52:02 +08:00

flashmla.py

enable full cudagraph for deepseek

2026-01-12 15:18:12 +08:00

indexer.py

add 2 kernels and optimize the calculation of topk_indices (#134)

2026-01-22 10:29:28 +08:00