enginex-ascend-910-llama.cpp

Jeff Bolz e56abd2098 vulkan: Implement topk_moe fused shader, ported from CUDA (#16641)

This is similar to the CUDA shader from #16130, but doesn't use shared memory
and handles different subgroup sizes.

2025-10-18 12:22:57 +02:00

ggml-blas

sync : whisper.cpp (ggml/1359)

2025-09-29 17:43:58 +03:00

ggml-cann

CANN: format code using .clang-format (#15863)

2025-10-16 16:41:11 +08:00

ggml-cpu

ggml : fix SpaceMit IME array out-of-bounds in task assignment (#16629)

2025-10-17 13:01:23 +03:00

ggml-cuda

CUDA: use registers instead of smem in topk-moe (#16647)

2025-10-18 11:52:53 +02:00

ggml-hip

CUDA: faster tile FA, add oob checks, more HSs (#16492)

2025-10-11 20:54:32 +02:00

ggml-metal

metal : add CONV_TRANSPOSE_2D (#16542)

2025-10-17 09:33:58 +03:00

ggml-musa

CUDA: faster tile FA, add oob checks, more HSs (#16492)

2025-10-11 20:54:32 +02:00

ggml-opencl

opencl: transposed gemm/gemv moe kernel with mxfp4,f32 (#16602)

2025-10-17 17:55:32 -07:00

ggml-rpc

rpc : report actual free memory (#16616)

2025-10-17 18:02:52 +03:00

ggml-sycl

SYCL SET operator optimized for F32 tensors (#16350)

2025-10-17 10:36:40 +08:00

ggml-vulkan

vulkan: Implement topk_moe fused shader, ported from CUDA (#16641)

2025-10-18 12:22:57 +02:00

ggml-webgpu

ggml webgpu: profiling, CI updates, reworking of command submission (#16452)

2025-10-07 13:48:56 -07:00

ggml-zdnn

zdnn: refactor codebase + add docs (#16178)

2025-09-23 14:53:05 +08:00

CMakeLists.txt

cmake : Dont define XOPENSOURCE on AIX (#16481)

2025-10-10 11:15:46 +03:00

ggml-alloc.c

ggml : fix graph reallocation with multiple chunks (#16396)

2025-10-03 13:49:08 +02:00

ggml-backend-impl.h

rpc : add support for multiple devices (#16276)

2025-10-04 12:49:16 +03:00

ggml-backend-reg.cpp

ggml-backend : add root cause in error message if loading backend library fails (#16172)

2025-09-29 13:17:09 +02:00

ggml-backend.cpp

llama: print memory breakdown on exit (#15860)

2025-09-24 16:53:48 +02:00

ggml-common.h

llama : add gpt-oss (#15091)

2025-08-05 22:10:36 +03:00

ggml-impl.h

vulkan: Implement topk_moe fused shader, ported from CUDA (#16641)

2025-10-18 12:22:57 +02:00

ggml-opt.cpp

finetune: SGD optimizer, more CLI args (#13873)

2025-08-14 12:03:57 +02:00

ggml-quants.c

ggml : fix uninitialized is_on_grid in quantize_row_iq3_xxs_impl (#15928)

2025-09-23 10:25:20 +02:00

ggml-quants.h

llama : add gpt-oss (#15091)

2025-08-05 22:10:36 +03:00

ggml-threading.cpp

ggml : build backends as libraries (#10256)

2024-11-14 18:04:35 +01:00

ggml-threading.h

remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797)

2024-12-12 19:02:49 +01:00

ggml.c

cpu : add FLOOR, CEIL, ROUND and TRUNC unary operators (#16083)

2025-10-15 21:24:51 +02:00

ggml.cpp

ggml : Print backtrace on uncaught C++ exceptions (ggml/1232)

2025-06-01 13:43:57 +03:00

gguf.cpp

gguf: gguf_writer refactor (#15691)

2025-09-05 11:34:28 +02:00