| Author | Commit | Message | Date |
|---|---|---|---|
| applesaucethebun | d738ab52f8 | fix some typos (#6209). Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca> | 2025-05-13 01:42:38 +08:00 |
| Yineng Zhang | 04f2abcb34 | fix: gemma 3 not use softcap (#5622) | 2025-04-22 01:16:08 -07:00 |
| Yun Dai | 2695ab0537 | Fix loading KV quantization scale; Enable modelopt kv cache (#4686). Co-authored-by: qingquansong <ustcsqq@gmail.com> | 2025-04-08 09:11:35 -07:00 |
| Juwan Yoo | 0bc0bf5734 | gemma3: impl get_attention_sliding_window_size for attn init (#4823) | 2025-03-27 10:43:58 -07:00 |
| Mick | 11577cedb7 | refactor: bug fixes and refactor for vlm (#4661) | 2025-03-22 22:48:49 -07:00 |
| Adarsh Shirawalmath | f8f9244a61 | [Bug Fix] Add partial rotary factor support for Phi-4 and upgrade to transformers v4.50.0 (#3984). Co-authored-by: Chayenne <zhaochen20@outlook.com> | 2025-03-22 14:27:39 -07:00 |
| Mick | 9d02bb3e2a | Urgent model support: support gemma-3-it (#4424) | 2025-03-16 17:37:32 -07:00 |