EngineX / xc-llm-kunlun
Files at commit 0711c1abfa52369d64202ad81fc7a70be16beb8b
Path: vllm_kunlun / ops / quantization
Latest commit: 0711c1abfa by Shiwen Tang, 2026-01-26 18:56:05 +08:00
[Feature] Support AWQ MoE W4A16 Quantization (#142)
Signed-off-by: tangshiwen <tangshiwen@baidu.com>
Co-authored-by: Li Wei <liwei.109@outlook.com>
compressed_tensors/   add 2 kernels and optimize the calculation of topk_indices (#134)   2026-01-22 10:29:28 +08:00
kernels/              [Feature] Support AWQ MoE W4A16 Quantization (#142)                  2026-01-26 18:56:05 +08:00
__init__.py           Initial commit for vLLM-Kunlun Plugin                                2025-12-10 12:05:39 +08:00
awq.py                [Feature] Support AWQ MoE W4A16 Quantization (#142)                  2026-01-26 18:56:05 +08:00
gptq.py               [refactor]update Kunlun classes with monkey patch (#122)             2026-01-19 20:24:19 +08:00
moe_wna16.py          [Feature] Support AWQ MoE W4A16 Quantization (#142)                  2026-01-26 18:56:05 +08:00