Ebey Abraham
b9e74f9bca
llama : add phi-2 + fix NeoX rope + ggml_mul_mat_set_prec (#4490)
* phi2 implementation
* fix breaking change
* phi-2 : various fixes
* phi-2 : use layer norm eps
* py : whitespaces
* llama : fix meta KV override bug
* convert : phi don't add BOS token
* convert : revert "added_tokens_decoder" change
* phi-2 : scale Q instead of KQ for better precision
* ggml : fix NeoX rope to rotate just first n_dims
* cuda : less diff in the rope_neox kernel
* ggml : add ggml_mul_mat_set_prec
ggml-ci
* Update ggml-cuda.cu
Co-authored-by: slaren <slarengh@gmail.com>
* Update ggml-cuda.cu
Co-authored-by: slaren <slarengh@gmail.com>
* cuda : ggml_cuda_op_mul_mat_cublas support F32 precision
* cuda : remove obsolete comment
---------
Co-authored-by: Ebey Abraham <ebeyabraham@microsoft.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: slaren <slarengh@gmail.com>
2023-12-18 19:27:47 +02:00