enginex-ascend-910-llama.cpp/CODEOWNERS

# collaborators can optionally add themselves here to indicate their availability for reviewing related PRs
# multiple collaborators per item can be specified

/.devops/*.Dockerfile                   @ngxson
/.github/actions/                       @slaren @CISC
/.github/workflows/                     @CISC
/.github/workflows/release.yml          @slaren
/.github/workflows/winget.yml           @slaren
/ci/                                    @ggerganov
/cmake/                                 @ggerganov
/common/CMakeLists.txt                  @ggerganov
/common/arg.*                           @ggerganov @ericcurtin
/common/base64.hpp                      @ggerganov
/common/build-info.*                    @ggerganov
/common/common.*                        @ggerganov
/common/console.*                       @ggerganov
/common/http.*                          @angt
/common/llguidance.*                    @ggerganov
/common/log.*                           @ggerganov
/common/sampling.*                      @ggerganov
/common/speculative.*                   @ggerganov
/convert_*.py                           @CISC
/examples/batched.swift/                @ggerganov
/examples/batched/                      @ggerganov
/examples/convert-llama2c-to-ggml/      @ggerganov
/examples/deprecation-warning/          @ggerganov
/examples/diffusion/                    @am17an
/examples/embedding/                    @ggerganov
/examples/eval-callback/                @ggerganov
/examples/export-docs/                  @ggerganov
/examples/gen-docs/                     @ggerganov
/examples/gguf/                         @ggerganov
/examples/llama.android/                @ggerganov
/examples/llama.swiftui/                @ggerganov
/examples/llama.vim                     @ggerganov
/examples/lookahead/                    @ggerganov
/examples/lookup/                       @JohannesGaessler
/examples/model-conversion/             @danbev
/examples/parallel/                     @ggerganov
/examples/passkey/                      @ggerganov
/examples/retrieval/                    @ggerganov
/examples/save-load-state/              @ggerganov
/examples/simple-chat/                  @slaren
/examples/simple/                       @slaren
/examples/speculative-simple/           @ggerganov
/examples/speculative/                  @ggerganov
/ggml/cmake/                            @ggerganov
/ggml/include/                          @ggerganov @slaren
/ggml/src/ggml-alloc.c                  @slaren
/ggml/src/ggml-backend*                 @slaren
/ggml/src/ggml-blas/                    @slaren
/ggml/src/ggml-common.h                 @ggerganov @slaren
/ggml/src/ggml-cpu/                     @ggerganov @slaren
/ggml/src/ggml-cpu/spacemit/            @alex-spacemit
/ggml/src/ggml-cuda/common.cuh          @slaren
/ggml/src/ggml-cuda/fattn*              @JohannesGaessler
/ggml/src/ggml-cuda/ggml-cuda.cu        @slaren
/ggml/src/ggml-cuda/mmf.*               @JohannesGaessler @am17an
/ggml/src/ggml-cuda/mmq.*               @JohannesGaessler
/ggml/src/ggml-cuda/mmvf.*              @JohannesGaessler
/ggml/src/ggml-cuda/mmvq.*              @JohannesGaessler
/ggml/src/ggml-cuda/fattn-wmma*         @IMbackK
/ggml/src/ggml-hip/                     @IMbackK
/ggml/src/ggml-cuda/vendors/hip.h       @IMbackK
/ggml/src/ggml-impl.h                   @ggerganov @slaren
/ggml/src/ggml-metal/                   @ggerganov
/ggml/src/ggml-opencl/                  @lhez @max-krasnyansky
/ggml/src/ggml-hexagon/                 @max-krasnyansky
/ggml/src/ggml-opt.cpp                  @JohannesGaessler
/ggml/src/ggml-quants.*                 @ggerganov
/ggml/src/ggml-rpc/                     @rgerganov
/ggml/src/ggml-threading.*              @ggerganov @slaren
/ggml/src/ggml-vulkan/                  @0cc4m
/ggml/src/ggml-webgpu/                  @reeselevine
/ggml/src/ggml-zdnn/                    @taronaeo @Andreas-Krebbel @AlekseiNikiforovIBM
/ggml/src/ggml.c                        @ggerganov @slaren
/ggml/src/ggml.cpp                      @ggerganov @slaren
/ggml/src/gguf.cpp                      @JohannesGaessler @Green-Sky
/gguf-py/                               @CISC
/media/                                 @ggerganov
/scripts/gen*                           @ggerganov
/scripts/get*                           @ggerganov
/scripts/sync*                          @ggerganov
/src/                                   @ggerganov
/src/llama-adapter.*                    @CISC
/src/llama-arch.*                       @CISC
/src/llama-chat.*                       @ngxson
/src/llama-graph.*                      @CISC
/src/llama-model-loader.*               @slaren
/src/llama-model.*                      @CISC
/src/llama-vocab.*                      @CISC
/tests/                                 @ggerganov
/tests/test-backend-ops.cpp             @slaren
/tests/test-thread-safety.cpp           @slaren
/tools/batched-bench/                   @ggerganov
/tools/llama-bench/                     @slaren
/tools/main/                            @ggerganov
/tools/mtmd/                            @ngxson
/tools/perplexity/                      @ggerganov
/tools/quantize/                        @ggerganov
/tools/rpc/                             @rgerganov
/tools/run/                             @ericcurtin
/tools/server/*                         @ngxson @ggerganov @ericcurtin # no subdir
/tools/server/webui/                    @allozaur
/tools/tokenize/                        @ggerganov
/tools/tts/                             @ggerganov
/vendor/                                @ggerganov
/.clang-format                          @slaren
/.clang-tidy                            @slaren
/AUTHORS                                @ggerganov
/CMakeLists.txt                         @ggerganov
/CONTRIBUTING.md                        @ggerganov
/LICENSE                                @ggerganov
/README.md                              @ggerganov
/SECURITY.md                            @ggerganov
/build-xcframework.sh                   @danbev
requirements*.txt                       @CISC