Enrique Shockwave | 4c7640079c | check marlin format before attempting conversion (#4675) | 2025-04-20 17:47:09 -07:00
Lianmin Zheng | 74e0ac1dbd | Clean up import vllm in quantization/__init__.py (#4834) | 2025-03-28 10:34:10 -07:00
Stefan He | 4c584fc632 | Fix circular imports in gptq.py and unblock test explorer (#4736) | 2025-03-24 18:07:08 -07:00
Xiaoyu Zhang | dd865befde | [Hotfix] solve fp8 w8a8 ci test fail (#4531) | 2025-03-17 23:17:04 -07:00
Xiaoyu Zhang | 9b81f9bd34 | sglang quant module remove vllm dependency (#4507) | 2025-03-17 15:51:59 -07:00
Lianmin Zheng | 45de89719c | Revert "[XPU][CPU] Enable the native path of DeepSeek" (#4367) | 2025-03-12 23:45:52 -07:00
Meng, Hengyu | 71046fcd71 | [XPU][CPU] Enable the native path of DeepSeek (#4086); Co-authored-by: Zhang, Liangang <liangang.zhang@intel.com> | 2025-03-12 22:26:29 -07:00
Qubitium-ModelCloud | 56a724eba3 | [QUANT] Add GPTQModel Dynamic Quantization + lm_head Quantization (#3790); Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>; Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai> | 2025-03-05 01:11:00 -08:00