# collaborators can optionally add themselves here to indicate their availability for reviewing related PRs
# multiple collaborators per item can be specified
/.devops/*.Dockerfile @ngxson
/.github/actions/ @slaren @CISC
/.github/workflows/ @CISC
/.github/workflows/release.yml @slaren
/.github/workflows/winget.yml @slaren
/ci/ @ggerganov
/cmake/ @ggerganov
/common/CMakeLists.txt @ggerganov
/common/arg.* @ggerganov @ericcurtin
/common/base64.hpp.* @ggerganov
/common/build-info.* @ggerganov
/common/common.* @ggerganov
/common/console.* @ggerganov
/common/http.* @angt
/common/llguidance.* @ggerganov
/common/log.* @ggerganov
/common/sampling.* @ggerganov
/common/speculative.* @ggerganov
/convert_*.py @CISC
/examples/batched.swift/ @ggerganov
/examples/batched/ @ggerganov
/examples/convert-llama2c-to-ggml/ @ggerganov
/examples/deprecation-warning/ @ggerganov
/examples/diffusion/ @am17an
/examples/embedding/ @ggerganov
/examples/eval-callback/ @ggerganov
/examples/export-docs/ @ggerganov
/examples/gen-docs/ @ggerganov
/examples/gguf/ @ggerganov
/examples/llama.android/ @ggerganov
/examples/llama.swiftui/ @ggerganov
/examples/llama.vim @ggerganov
/examples/lookahead/ @ggerganov
/examples/lookup/ @JohannesGaessler
/examples/model-conversion/ @danbev
/examples/parallel/ @ggerganov
/examples/passkey/ @ggerganov
/examples/retrieval/ @ggerganov
/examples/save-load-state/ @ggerganov
/examples/simple-chat/ @slaren
/examples/simple/ @slaren
/examples/speculative-simple/ @ggerganov
/examples/speculative/ @ggerganov
/ggml/cmake/ @ggerganov
/ggml/include/ @ggerganov @slaren
/ggml/src/ggml-alloc.c @slaren
/ggml/src/ggml-backend* @slaren
/ggml/src/ggml-blas/ @slaren
/ggml/src/ggml-common.h @ggerganov @slaren
/ggml/src/ggml-cpu/ @ggerganov @slaren
/ggml/src/ggml-cpu/spacemit/ @alex-spacemit
/ggml/src/ggml-cuda/common.cuh @slaren
/ggml/src/ggml-cuda/fattn* @JohannesGaessler
/ggml/src/ggml-cuda/ggml-cuda.cu @slaren
/ggml/src/ggml-cuda/mmf.* @JohannesGaessler @am17an
/ggml/src/ggml-cuda/mmq.* @JohannesGaessler
/ggml/src/ggml-cuda/mmvf.* @JohannesGaessler
/ggml/src/ggml-cuda/mmvq.* @JohannesGaessler
/ggml/src/ggml-cuda/fattn-wmma* @IMbackK
/ggml/src/ggml-hip/ @IMbackK
/ggml/src/ggml-cuda/vendors/hip.h @IMbackK
/ggml/src/ggml-impl.h @ggerganov @slaren
/ggml/src/ggml-metal/ @ggerganov
/ggml/src/ggml-opencl/ @lhez @max-krasnyansky
/ggml/src/ggml-hexagon/ @max-krasnyansky
/ggml/src/ggml-opt.cpp @JohannesGaessler
/ggml/src/ggml-quants.* @ggerganov
/ggml/src/ggml-rpc/ @rgerganov
/ggml/src/ggml-threading.* @ggerganov @slaren
/ggml/src/ggml-vulkan/ @0cc4m
/ggml/src/ggml-webgpu/ @reeselevine
/ggml/src/ggml-zdnn/ @taronaeo @Andreas-Krebbel @AlekseiNikiforovIBM
/ggml/src/ggml.c @ggerganov @slaren
/ggml/src/ggml.cpp @ggerganov @slaren
/ggml/src/gguf.cpp @JohannesGaessler @Green-Sky
/gguf-py/ @CISC
/media/ @ggerganov
/scripts/gen* @ggerganov
/scripts/get* @ggerganov
/scripts/sync* @ggerganov
/src/ @ggerganov
/src/llama-adapter.* @CISC
/src/llama-arch.* @CISC
/src/llama-chat.* @ngxson
/src/llama-graph.* @CISC
/src/llama-model-loader.* @slaren
/src/llama-model.* @CISC
/src/llama-vocab.* @CISC
/tests/ @ggerganov
/tests/test-backend-ops.cpp @slaren
/tests/test-thread-safety.cpp @slaren
/tools/batched-bench/ @ggerganov
/tools/llama-bench/ @slaren
/tools/main/ @ggerganov
/tools/mtmd/ @ngxson
/tools/perplexity/ @ggerganov
/tools/quantize/ @ggerganov
/tools/rpc/ @rgerganov
/tools/run/ @ericcurtin
/tools/server/* @ngxson @ggerganov @ericcurtin # no subdir
/tools/server/webui/ @allozaur
/tools/tokenize/ @ggerganov
/tools/tts/ @ggerganov
/vendor/ @ggerganov
/.clang-format @slaren
/.clang-tidy @slaren
/AUTHORS @ggerganov
/CMakeLists.txt @ggerganov
/CONTRIBUTING.md @ggerganov
/LICENSE @ggerganov
/README.md @ggerganov
/SECURITY.md @ggerganov
/build-xcframework.sh @danbev
requirements*.txt @CISC