Georgi Gerganov
574406dc7e
ggml : add Q5_0 and Q5_1 quantization (#1187)
* ggml : add Q5_0 quantization (cuBLAS only)
* ggml : fix Q5_0 qh -> uint32_t
* ggml : fix q5_0 histogram stats
* ggml : q5_0 scalar dot product
* ggml : q5_0 ARM NEON dot
* ggml : q5_0 more efficient ARM NEON using uint64_t masks
* ggml : rename Q5_0 -> Q5_1
* ggml : adding Q5_0 mode
* quantize : add Q5_0 and Q5_1 to map
* ggml : AVX2 optimizations for Q5_0, Q5_1 (#1195)
---------
Co-authored-by: Stephan Walter <stephan@walter.name>
2023-04-26 23:14:13 +03:00
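As a rough illustration of what a 5-bit block quantization such as Q5_0 involves, the following is a minimal sketch and not the code from this commit. The commit message mentions a `uint32_t` `qh` field; the sketch assumes it holds the fifth (high) bit of each value in a block. The block size of 32, the `float` scale, the nibble order, and the names `block_q5_sketch`, `quantize_block`, and `dequantize_block` are all assumptions made for illustration.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

#define QK 32 /* values per block (assumed for this sketch) */

/* Hypothetical block layout: one scale, 32 high bits, 32 low nibbles. */
typedef struct {
    float    d;          /* scale; stored as float here for simplicity */
    uint32_t qh;         /* bit j holds the 5th (high) bit of value j */
    uint8_t  qs[QK / 2]; /* low 4 bits, two values per byte */
} block_q5_sketch;

/* Quantize QK floats to 5-bit codes in [0, 31], centered on 16. */
static void quantize_block(const float *x, block_q5_sketch *b) {
    float amax = 0.0f; /* largest magnitude in the block */
    float max  = 0.0f; /* signed value that has that magnitude */
    for (int j = 0; j < QK; j++) {
        if (fabsf(x[j]) > amax) {
            amax = fabsf(x[j]);
            max  = x[j];
        }
    }

    /* Map the largest-magnitude value to code 0, i.e. to -16 * d. */
    const float d  = max / -16.0f;
    const float id = (d != 0.0f) ? 1.0f / d : 0.0f;

    b->d  = d;
    b->qh = 0;
    memset(b->qs, 0, sizeof b->qs);

    for (int j = 0; j < QK; j++) {
        int q = (int)(x[j] * id + 16.5f); /* round to nearest code */
        if (q < 0)  q = 0;
        if (q > 31) q = 31;
        b->qs[j / 2] |= (uint8_t)((q & 0x0F) << (4 * (j % 2))); /* low nibble */
        b->qh        |= (uint32_t)((q >> 4) & 1) << j;          /* high bit */
    }
}

/* Rebuild each 5-bit code from its nibble and high bit, then rescale. */
static void dequantize_block(const block_q5_sketch *b, float *y) {
    for (int j = 0; j < QK; j++) {
        const int lo = (b->qs[j / 2] >> (4 * (j % 2))) & 0x0F;
        const int hi = (int)((b->qh >> j) & 1u);
        y[j] = (float)((lo | (hi << 4)) - 16) * b->d;
    }
}
```

Packing the 32 high bits into a single 32-bit word keeps the per-block overhead at 4 bytes and lets a dot-product kernel expand them with mask operations, which is presumably the motivation for the `uint32_t` change listed above.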