EngineX / xc-llm-kunlun
Branch: main
Path: vllm_kunlun/ops/quantization
Latest commit: bf9369f733 by Xinyu Dong — Migrate XTorch operations to Kunlun operations (accelerating iteration) (#177)
Signed-off-by: dongxinyu03 <dongxinyu03@baidu.com>
2026-02-12 18:13:00 +08:00
compressed_tensors/   Migrate XTorch operations to Kunlun operations (accelerating iteration) (#177)   2026-02-12 18:13:00 +08:00
kernels/              [Feature] support compressed-tensors w4a16 quantization (#154)                    2026-01-27 19:56:22 +08:00
__init__.py           Initial commit for vLLM-Kunlun Plugin                                             2025-12-10 12:05:39 +08:00
awq.py                [Feature] Support AWQ MoE W4A16 Quantization (#142)                               2026-01-26 18:56:05 +08:00
gptq.py               [refactor]update Kunlun classes with monkey patch (#122)                          2026-01-19 20:24:19 +08:00
moe_wna16.py          [Feature] Support AWQ MoE W4A16 Quantization (#142)                               2026-01-26 18:56:05 +08:00