xc-llm-kunlun/ops at c9f00c132c821e9264ca3efa23796ea3f08104ca

EngineX/xc-llm-kunlun

Li Wei 2a2d773ad0 [fix]bias bug in kunlun_scale_mm (#126)

2026-01-20 13:24:52 +08:00

..

longcontext chunk make attention crash, fix it (#117)

2026-01-17 18:38:23 +08:00

[Kernel] Qwen3-next: optimize recompute_w_u_fwd & chunk_fwd_o (#74)

2026-01-05 10:24:51 +08:00

[refactor]update Kunlun classes with monkey patch (#122)

2026-01-19 20:24:19 +08:00

[Kernel] Optimize the performance of causal_conv1d.

2025-12-12 17:22:35 +08:00

[fix]bias bug in kunlun_scale_mm (#126)

2026-01-20 13:24:52 +08:00

Commit the vllm 0.11.0 development branch

2025-12-10 17:51:24 +08:00

__init__.py

[fix]update compressed-tensors scheme

2026-01-06 22:30:27 +08:00

_kunlun_ops.py

[Feature] support deepseek v3/r1/v3.2 (#78)

2026-01-05 22:55:35 +08:00

activation.py

[Feature] support deepseek v3/r1/v3.2 (#78)

2026-01-05 22:55:35 +08:00

deep_gemm.py

[Misc]Specify that DS32 only supports --kv-cache-dtype bfloat16 (#119)

2026-01-17 16:52:02 +08:00

layernorm.py

[Feature] Support XiaoMi MIMO Flash V2 (#62)

2025-12-31 10:16:33 +08:00

linear.py

[Feature] Support XiaoMi MIMO Flash V2 (#62)

2025-12-31 10:16:33 +08:00

paged_attn.py

[Feature] Support XiaoMi MIMO Flash V2 (#62)

2025-12-31 10:16:33 +08:00

rotary_embedding.py

[Feature] support deepseek v3/r1/v3.2 (#78)

2026-01-05 22:55:35 +08:00

vocab_parallel_embedding.py

[Feature] Support XiaoMi MIMO Flash V2 (#62)

2025-12-31 10:16:33 +08:00