enginex-ascend-910-llama.cpp/set-rows.cuh at 5eff6ec9b1220b599a43b594b1110487ab6aca08

Aman Gupta 7de5c7cab6 CUDA: add set rows for f32 and f16 (#14551)

* CUDA: add set rows for f32 and f16

* Review: change kernel params, use strides from host

* Use 1-d kernel

* Review: use int64_t for blockDim.x, rename nb->s for clarity

2025-07-12 16:31:38 +03:00

#pragma once

#include "common.cuh"

// Block size for the set_rows kernel launch.
#define CUDA_SET_ROWS_BLOCK_SIZE 256

void ggml_cuda_op_set_rows(ggml_backend_cuda_context & ctx, ggml_tensor * dst);
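
The commit notes mention a 1-d kernel, so a block-size constant like this is typically paired with a ceiling division on the host to pick the grid size. The following is a minimal host-side sketch of that arithmetic, not the actual implementation: the helper name `set_rows_num_blocks` is hypothetical, and the assumption that one thread handles one element is not confirmed by this header.

```cpp
#include <cassert>
#include <cstdint>

// Same value as the macro declared in set-rows.cuh.
#define CUDA_SET_ROWS_BLOCK_SIZE 256

// Hypothetical helper (not part of the source): number of 1-D thread blocks
// needed so that each of n_elements is covered by at least one thread,
// assuming one thread per element.
inline int64_t set_rows_num_blocks(int64_t n_elements) {
    assert(n_elements >= 0);
    // Ceiling division: round up so a partial final block is still launched.
    return (n_elements + CUDA_SET_ROWS_BLOCK_SIZE - 1) / CUDA_SET_ROWS_BLOCK_SIZE;
}
```

With this rounding, the last block may contain threads whose index falls past the end of the data, so a kernel launched this way would need a bounds check before writing.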