Georgi Gerganov
29ae62d2ae
llama : fix embeddings (#5796)
* llama : fix embeddings
ggml-ci
* llama : do not use KV cache for non-causal models
ggml-ci
* embeddings : fix llama_batch_init arg
* llama : add pooling switch
* llama : distinguish token vs sequence embeddings
ggml-ci
* llama : assert pooling tensor
* llama : simplify causal mask condition
ggml-ci
* llama : assert input batch with pooling enabled
* readme : update API changes list
2024-03-04 22:31:20 +02:00