R0CKSTAR
9b8f3c6c77
musa: fix build warnings (unused variable) (#14869)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-07-26 10:36:02 +08:00
Sigbjørn Skjæret
e28c0b80c2
cuda : implement bf16 cpy ops and enable bf16 cont (#14763)
* implement bf16 cpy ops and enable bf16 cont
* deduplicate copy functions
* deduplicate checks
2025-07-22 12:33:10 +02:00
Aman Gupta
f9a31eea06
CUDA: set_rows + cpy.cu refactor (#14712)
2025-07-18 14:54:18 +08:00
R0CKSTAR
cbc68be51d
cuda: fix build warnings in set-rows.cu (unused variable) (#14687)
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-07-15 15:28:53 +08:00
Sigbjørn Skjæret
923e3ea2e3
cuda : add set rows for bf16 (#14664)
2025-07-13 15:01:24 +02:00
Aman Gupta
7de5c7cab6
CUDA: add set rows for f32 and f16 (#14551)
* CUDA: add set rows for f32 and f16
* Review: change kernel params, use strides from host
* Use 1-d kernel
* Review: use int64_t for blockDim.x, rename nb -> s for clarity
2025-07-12 16:31:38 +03:00