EngineX / xc-llm-kunlun
Branch: main
Path: vllm_kunlun/ops/quantization
Latest commit: bf9369f733 by Xinyu Dong — Migrate XTorch operations to Kunlun operations (accelerating iteration) (#177)
Signed-off-by: dongxinyu03 <dongxinyu03@baidu.com>
2026-02-12 18:13:00 +08:00
compressed_tensors/   Migrate XTorch operations to Kunlun operations (accelerating iteration) (#177)   2026-02-12 18:13:00 +08:00
kernels/              [Feature] support compressed-tensors w4a16 quantization (#154)                    2026-01-27 19:56:22 +08:00
__init__.py           Initial commit for vLLM-Kunlun Plugin                                             2025-12-10 12:05:39 +08:00
awq.py                [Feature] Support AWQ MoE W4A16 Quantization (#142)                               2026-01-26 18:56:05 +08:00
gptq.py               [refactor]update Kunlun classes with monkey patch (#122)                          2026-01-19 20:24:19 +08:00
moe_wna16.py          [Feature] Support AWQ MoE W4A16 Quantization (#142)                               2026-01-26 18:56:05 +08:00