xc-llm-kunlun/quantization at 74d4f804e8f09ade4eca55dcf19db48ed8394be2

EngineX/xc-llm-kunlun


fromck 74d4f804e8 add 2 kernels and optimize the calculation of topk_indices (#134)

Co-authored-by: chengxiaokang <chengxiaokang@baidu.com>

2026-01-22 10:29:28 +08:00


compressed_tensors

add 2 kernels and optimize the calculation of topk_indices (#134)

2026-01-22 10:29:28 +08:00

[fix]bias bug in kunlun_scale_mm (#126)

2026-01-20 13:24:52 +08:00

__init__.py

Initial commit for vLLM-Kunlun Plugin

2025-12-10 12:05:39 +08:00

awq.py

[refactor]update Kunlun classes with monkey patch (#122)

2026-01-19 20:24:19 +08:00

gptq.py

[refactor]update Kunlun classes with monkey patch (#122)

2026-01-19 20:24:19 +08:00