Yun Dai | 2695ab0537 | Fix loading KV quantization scale; Enable modelopt kv cache (#4686) Co-authored-by: qingquansong <ustcsqq@gmail.com> | 2025-04-08 09:11:35 -07:00 |
Qubitium-ModelCloud | 56a724eba3 | [QUANT] Add GPTQModel Dynamic Quantization + lm_head Quantization (#3790) Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai> Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai> | 2025-03-05 01:11:00 -08:00 |
fzyzcjy | d37f95511d | Improve: Tiny fix Olmo2 (#3348) | 2025-02-21 16:09:35 -08:00 |
Yineng Zhang | 5a176c92df | fix deepseek v2 with cpu device (#2975) | 2025-01-19 21:33:27 +08:00 |
Yineng Zhang | 2add697d7a | feat: remove vllm get_rope (#2964) | 2025-01-18 19:38:01 +08:00 |
Yineng Zhang | 033c715b46 | cleanup models dependencies 1/n (#2948) | 2025-01-17 23:46:48 +08:00 |
Yineng Zhang | 5dc54f1a62 | feat: remove vllm distributed (#2907) Co-authored-by: Zhangyi <1109276519@qq.com> | 2025-01-17 22:31:51 +08:00 |
Yineng Zhang | 85e1a6f3aa | Update model_loader deps and qqq quantization deps (#2220) (#2318) Co-authored-by: HandH1998 <1335248067@qq.com> | 2024-12-02 23:22:13 +08:00 |
Jani Monoses | db674e3d24 | Add OLMo2 model. (#2233) | 2024-11-28 00:15:20 -08:00 |