248367605e
Work around for recalculating logits in cached prompts (Fixes #1585) (#1609)
DannyDaemonic
2023-05-29 05:13:40 -07:00
0e730dd23b
Adding git in container package dependencies (#1621)
Jiří Podivín
2023-05-29 06:45:50 +02:00
3b126f654f
LLAMA_DEBUG adds debug symbols (#1617)
Johannes Gäßler
2023-05-28 21:01:02 +02:00
1b78ed2081
Only show -ngl option when relevant + other doc/arg handling updates (#1625)
Kerfuffle
2023-05-28 11:48:57 -06:00
337aea1139
examples : add --alias option to gpt_params to set use friendly model name (#1614)
Vladimir Zorin
2023-05-28 20:14:24 +03:00
bb051d9723
opencl : no need to allocate cl_mem on heap (#1612)
Howard Su
2023-05-29 01:13:36 +08:00
ca74884f66
opencl : use strstr to check if fp16 supported (#1611)
Howard Su
2023-05-29 01:09:56 +08:00
a6704643b6
ggml : add support for the RISCV architecture (#1616)
apcameron
2023-05-27 21:03:25 +01:00
0df7d63e5b
Include server in releases + other build system cleanups (#1610)
Kerfuffle
2023-05-27 11:04:14 -06:00
97c9b77c4f
Add documentation about CLBlast (#1604)
Henri Vasserman
2023-05-27 18:47:55 +03:00
0ecb1bbbeb
[CI] Fix openblas (#1613)
Henri Vasserman
2023-05-27 17:24:06 +03:00
93618031c7
ggml : add ggml_tensor_overhead()
Georgi Gerganov
2023-05-27 16:19:56 +03:00
83c54e6da5
[CI] CLBlast: Fix directory name (#1606)
Henri Vasserman
2023-05-27 15:18:25 +03:00
bdbda1b17a
ggml : sync ggml core (minor additions, e.g. ggml_get_tensor_by_name())
Georgi Gerganov
2023-05-27 12:22:05 +03:00
66874d4fbc
Some improvements to loading the session with --prompt-cache (#1550)
Kerfuffle
2023-05-25 20:18:01 -06:00
1fcdcc28b1
cuda : performance optimizations (#1530)
Johannes Gäßler
2023-05-25 23:07:29 +02:00
ac7876ac20
Update CLBlast to 1.6.0 (#1580)
Henri Vasserman
2023-05-24 10:30:09 +03:00
c31bbe934b
readme : add docs for chat-persistent.sh (#1568)
Evan Jones
2023-05-24 02:24:01 -04:00
1359b6aba5
chat-persistent.sh : use bracket expressions in grep (#1564)
Senemu
2023-05-24 06:16:22 +00:00
7d873811f3
Fix handling of "invalid property" when creating OpenCL command queue (#1565)
Maarten ter Huurne
2023-05-23 18:01:15 +02:00
2e6cd4b025
OpenCL Token Generation Acceleration (#1459)
0cc4m
2023-05-22 23:33:24 +02:00
7e4ea5beff
examples : add server example with REST API (#1443)
Steward Garcia
2023-05-21 11:51:18 -06:00
7780e4f479
make : .PHONY clean (#1553)
Stefan Sydow
2023-05-21 16:03:44 +02:00
265db9834e
ggml : output 3d sizes in ggml_graph_dump_dot()
Georgi Gerganov
2023-05-21 11:56:23 +03:00
fab49c685e
ggml : update WASM SIMD
Georgi Gerganov
2023-05-20 20:00:41 +03:00
b8ee340abe
feature : support blis and other blas implementation (#1536)
Zenix
2023-05-20 23:58:31 +09:00
9ecb30f959
OpenCL: Fixes for older devices. (#1435)
Henri Vasserman
2023-05-20 17:57:39 +03:00
29cf5596fe
llama : define magic numbers as integer constants (#1518) (#1520)
Juuso Alasuutari
2023-05-20 15:58:15 +03:00
3de84b2606
ggml : add ggml_clamp() (#1539)
Georgi Gerganov
2023-05-20 15:34:45 +03:00
affc76edfd
cuda : loading models directly into VRAM, norm calculation on GPU, broadcasting for ggml_mul (#1483)
Johannes Gäßler
2023-05-20 14:19:28 +02:00
ea600071cb
Revert "feature : add blis and other BLAS implementation support (#1502)"
Georgi Gerganov
2023-05-20 12:03:48 +03:00
07e9ace0f9
feature : add blis and other BLAS implementation support (#1502)
Zenix
2023-05-20 18:02:48 +09:00
ec2e10c444
llama : add llama_init_backend() API (close #1527)
Georgi Gerganov
2023-05-20 11:06:11 +03:00
d2c59b8ba4
Fix for mingw (#1462)
DannyDaemonic
2023-05-20 00:40:02 -07:00
503db28849
llama : fix name shadowing and C4146 (#1526)
Maxime
2023-05-20 09:22:37 +02:00
8a203f9fa1
llama : fix compile warnings in llama_set_state_data()
Georgi Gerganov
2023-05-20 10:14:31 +03:00
4fd3e29297
ggml : fix scalar implementation of Q4_1 dot
Georgi Gerganov
2023-05-20 10:13:19 +03:00
2d5db48371
ggml : use F16 instead of F32 in Q4_0, Q4_1, Q8_0 (#1508)
Georgi Gerganov
2023-05-19 22:17:18 +03:00
6986c7835a
tests : add missing header
Georgi Gerganov
2023-05-19 21:17:28 +03:00
943e6081cc
examples : add persistent chat (#1495)
Evan Jones
2023-05-19 13:39:51 -04:00
7694b52b9a
main : make reverse prompt option act as a stop token in non-interactive mode (#1032)
Jason McCartney
2023-05-19 10:24:59 -07:00
79e3efb0e9
readme : adds WizardLM to the list of supported models (#1485)
David Kennedy
2023-05-19 13:16:30 -04:00
4b7e245adf
minor : fix compile warnings
Georgi Gerganov
2023-05-19 20:14:51 +03:00
5ea4339273
make kv_f16 the default for api users (#1517)
Erik Scholz
2023-05-18 19:31:01 +02:00
ee9654138a
Fixes #1511 lambda issue for w64devkit (mingw) (#1513)
DannyDaemonic
2023-05-18 10:30:40 -07:00
dc271c52ed
Remove unused n_parts parameter (#1509)
Stephan Walter
2023-05-17 22:12:01 +00:00
c238b5873a
benchmark-matmul: Print the average of the test results (#1490)
rankaiyx
2023-05-17 22:47:58 +08:00
2b2646931b
convert.py: Support models which are stored in a single pytorch_model.bin (#1469)
Tom Jobbins
2023-05-16 23:04:35 +01:00
42627421ec
~7% faster Q5_1 AVX2 code (#1477)
Ilya Kurdyukov
2023-05-17 01:36:47 +07:00
9560655409
define default model path once, sync path with readme (#1366)
András Salamon
2023-05-16 16:46:34 +01:00
2a5ee023ad
Add alternate include path for openblas (#1476)
sandyiscool
2023-05-16 14:00:15 +05:30
63d20469b8
fix get_num_physical_cores() (#1436)
zrm
2023-05-14 22:25:42 -04:00
b5c9295eef
benchmark-matmul: fix clang-tidy issues, report results in GFLOPS (#1458)
slaren
2023-05-14 22:46:00 +02:00
eb363627fd
cuda : deduplicated dequantization code (#1453)
Johannes Gäßler
2023-05-14 20:53:23 +02:00
79b2d5b69d
ggml : alternative fix for race condition bug in non-inplace ggml_compute_forward_diag_mask_f32 (#1454)
xaedes
2023-05-14 17:55:02 +02:00
13c351ad72
ggml : various fixes (#1450)
Georgi Gerganov
2023-05-14 18:22:50 +03:00
60f8c361ca
ggml : add AVX support based on AVX2 code (#1430)
katsu560
2023-05-14 19:03:51 +09:00
601a033475
ggml : add GGML_QNT_VERSION to track quantization format changes
Georgi Gerganov
2023-05-14 10:20:19 +03:00
08737ef720
cuda : fix convert function (#1412)
Georgi Gerganov
2023-05-13 17:40:58 +03:00
bda4d7c215
make : fix PERF build with cuBLAS
Georgi Gerganov
2023-05-13 17:25:09 +03:00
5a5aeb1e91
llama : fix unused warning
Georgi Gerganov
2023-05-13 16:55:14 +03:00
66841fdb0e
ggml : multi-thread mul and diag_mask ops (#1428)
Georgi Gerganov
2023-05-13 16:48:03 +03:00
905d87b70a
ggml : GPU-accelerated token generation (#1412)
Johannes Gäßler
2023-05-13 15:38:36 +02:00
f954edda93
ggml : implement backward pass for llama + small training-llama-from-scratch example (#1360)
xaedes
2023-05-13 14:56:40 +02:00
f048af0230
ggml : sync alibi fix from ggml repo
Georgi Gerganov
2023-05-13 11:54:33 +03:00
ac0cd259d5
Adding SSE instructions to ggml_vec_dot_q4_0_q8_0 (#1413)
3ooabkhxtn
2023-05-13 10:43:33 +02:00
0cd22e190a
llama : fix various warnings
Georgi Gerganov
2023-05-13 11:23:15 +03:00
6456a4eb9f
embedding : remove unused code (#1426)
Rinne
2023-05-13 15:24:20 +08:00
cdd5350892
readme : update Q4_0 perplexities
Georgi Gerganov
2023-05-13 09:12:44 +03:00
738ace394a
llama : free ggml context in set / copy state data (close #1425)
Georgi Gerganov
2023-05-13 09:08:52 +03:00
699b1ad7fe
opencl : fix kernels for the new formats (#1422)
Henri Vasserman
2023-05-13 09:01:15 +03:00
fb62f92433
llama : fix --mtest option (close #1414)
Georgi Gerganov
2023-05-12 21:44:20 +03:00
773ee249fb
CLI args use - instead of _, backwards compatible (#1416)
Johannes Gäßler
2023-05-12 16:34:55 +02:00
553fd4d4b5
Add clang-tidy reviews to CI (#1407)
slaren
2023-05-12 15:40:53 +02:00
089b1c93ba
readme : add C#/.NET bindings repo (#1409)
Rinne
2023-05-12 13:39:40 +08:00
b9fd7eee57
ggml : remove bit shuffling (#1405)
Georgi Gerganov
2023-05-12 00:23:08 +03:00
b608b55a3e
prompts : model agnostic DAN (#1304)
CRD716
2023-05-11 10:10:19 -05:00
cf348a60e0
main : add option to save full output to session (#1338)
Evan Jones
2023-05-10 11:37:14 -04:00
e6a46b0ed1
Locale fix for Windows (#1379)
DannyDaemonic
2023-05-09 10:53:28 -07:00
9f8dbc4787
use pause asm insn in busyloop to run the CPU (13600K) 10 °C cooler (#1314)
Sami Farin
2023-05-09 15:29:20 +03:00
41654efea8
Interface improvements and --multiline-input (previously --author-mode) (#1040)
DannyDaemonic
2023-05-08 19:45:48 -07:00
56551bc11f
readme : add notice about upcoming breaking change
Georgi Gerganov
2023-05-08 22:52:18 +03:00
fe60904eef
readme : add TOC and Pygmalion instructions (#1359)
AlpinDale
2023-05-08 21:03:30 +04:30
003ba2fb43
llama : fix hparams shadow (#1367)
Pavol Rusnak
2023-05-08 16:48:21 +02:00
f9a6364912
llama : require first token to be BOS (#1303)
Georgi Gerganov
2023-05-08 17:41:54 +03:00
95078cc554
convert: add ability to convert safetensors files (#1276)
ubik2
2023-05-08 04:54:26 -07:00
1f48b0abcf
Documented CUDA reproducibility, added warning (#1346)
Johannes Gäßler
2023-05-08 02:42:01 +02:00
e1295513a4
CI: add Windows CLBlast and OpenBLAS builds (#1277)
Henri Vasserman
2023-05-07 14:20:09 +03:00
1b0fd45465
ggml : Allow usage of CLBlast alongside Accelerate.framework (#1336)
swittk
2023-05-07 10:03:23 +07:00
3924088512
Remove default arguments from sampling functions (#1343)
Jed Fox
2023-05-06 17:01:47 -04:00
173d0e6419
makefile: automatic Arch Linux detection (#1332)
DaniAndTheWeb
2023-05-05 23:57:14 +02:00
a3b85b28da
ci : add cublas to windows release (#1271)
Erik Scholz
2023-05-05 22:56:09 +02:00
921dcee00a
readme: add missing info (#1324)
Pavol Rusnak
2023-05-05 16:43:36 +02:00
2d13786e91
Fix for OpenCL / clbast builds on macOS. (#1329)
Ionoclast Laboratories
2023-05-05 08:18:21 -04:00
a90e96b266
Convert.py @staticmethod (#1327)
Benjamin Lecaillon
2023-05-05 02:17:07 +02:00
94c5652fc0
quantize: make output filename optional, default to ggml-model-<ftype>.bin (#1301)
slaren
2023-05-05 00:58:56 +02:00
34d9f22f44
Wrap exceptions in std::exception to verbose output on exception. (#1316)
Ivan Stepanov
2023-05-04 19:56:27 +03:00
d3e8093e9b
convert: support DT_BF16 tensors (#1309)
Ivan Stepanov
2023-05-04 19:54:37 +03:00
360cfe5bec
readme : add OpenBuddy link (#1321)
44670
2023-05-05 00:33:31 +08:00
2edbdb0f99
main : add --in-suffix option (#1318)
44670
2023-05-04 23:41:12 +08:00