Georgi Gerganov
7a32fcb3b2
ggml : add Q8_0 quantization format (rename the old one to Q8_1) (ARM NEON) (#1179)
* ggml : add Q8_0 quantization format (rename the old one to Q8_1)
* tests : fix test-quantize-fns
* ggml : finalize Q8_0 implementation
* ggml : use q4_0_q8_0 and q4_2_q8_0
* ggml : fix Q8_0 dot product bug (ARM)
* ggml : Q8_0 unroll x2
* ggml : fix bug - using wrong block type
* ggml : extend quantize_fns_t with "vec_dot_type"
* ggml : fix Q8_0 to use 255 values out of 256
* ggml : fix assert using wrong QK4_2 instead of QK4_3
2023-04-25 23:40:51 +03:00