Xinyu Dong
|
bf9369f733
|
Migrate XTorch operations to Kunlun operations (accelerating iteration) (#177)
Signed-off-by: dongxinyu03 <dongxinyu03@baidu.com>
|
2026-02-12 18:13:00 +08:00 |
|
Li Wei
|
71bd70ad6c
|
[Feature] support compressed-tensors w4a16 quantization (#154)
- native int4 kimi model inference is supported
Signed-off-by: Li Wei <liwei.109@outlook.com>
|
2026-01-27 19:56:22 +08:00 |
|
Shiwen Tang
|
0711c1abfa
|
[Feature] Support AWQ MoE W4A16 Quantization (#142)
Signed-off-by: tangshiwen <tangshiwen@baidu.com>
Co-authored-by: Li Wei <liwei.109@outlook.com>
|
2026-01-26 18:56:05 +08:00 |
|
fromck
|
74d4f804e8
|
add 2 kernels and optimize the calculation of topk_indices (#134)
Co-authored-by: chengxiaokang <chengxiaokang@baidu.com>
|
2026-01-22 10:29:28 +08:00 |
|
Li Wei
|
2a2d773ad0
|
[fix]bias bug in kunlun_scale_mm (#126)
|
2026-01-20 13:24:52 +08:00 |
|
Li Wei
|
8f56cbf3ed
|
[refactor]update Kunlun classes with monkey patch (#122)
Signed-off-by: Li Wei <liwei.109@outlook.com>
|
2026-01-19 20:24:19 +08:00 |
|
baoqian426
|
eb40e8a07a
|
[Bugfix] fix failure to import compressed_tensors (#87)
Co-authored-by: root <root@rdtest-node1150.bcc-zwlt.baidu.com>
|
2026-01-07 11:32:10 +08:00 |
|
Li Wei
|
1c1b84d78c
|
[fix]update compressed-tensors scheme
Deepseek v3.2 is supported now
Signed-off-by: Li Wei <liwei.109@outlook.com>
|
2026-01-06 22:30:27 +08:00 |
|
Li Wei
|
515a4eeda9
|
[dev] support compressed-tensors w8a8 quantization (#75)
* [dev] support compressed-tensors w8a8 quantization
Co-authored-by: Li Wei <liwei.109@outlook.com>
* [refact]update KunlunScaleMMKernel impl
* [rebase]resolve conflicts and remove redundant code
---------
Co-authored-by: tangshiwen <tangshiwen@baidu.com>
|
2026-01-06 13:51:53 +08:00 |
|
baoqian426
|
ee0f50e68f
|
[Feature] support deepseek v3/r1/v3.2 (#78)
* [Feature] support deepseek v3/r1/v3.2
* fix gpt_oss
* update readme
* update readme
---------
Co-authored-by: hanhaowen <hanhaowen@baidu.com>
|
2026-01-05 22:55:35 +08:00 |
|
Li Wei
|
6546323c71
|
[dev] support AWQ/GPTQ quantization for dense models
|
2025-12-24 13:46:06 +08:00 |
|
chenyili
|
7c22d621fb
|
Commit the vLLM 0.11.0 development branch
|
2025-12-10 17:51:24 +08:00 |
|
dongxinyu03
|
c728e52505
|
Initial commit for vLLM-Kunlun Plugin
|
2025-12-10 12:05:39 +08:00 |
|