metal : extend mat-mat multiplication support (#16225)

* metal : support mul_mm with src1->type == GGML_TYPE_F16

* metal : support mul_mm_id with src1->type == GGML_TYPE_F16

[no ci]

* metal : support mul_mm with ne00 % 32 != 0

* metal : support mul_mm_id with ne00 % 32 != 0

* cont : remove unnecessary unrolls

* cont : simplify data loading

* metal : optimize mul_mm when output bounds checks are not needed
Georgi Gerganov
2025-09-28 09:34:44 +03:00
committed by GitHub
parent 3b53634fe3
commit 6a2c6145a0
7 changed files with 247 additions and 120 deletions


@@ -717,8 +717,7 @@ bool ggml_metal_device_supports_op(ggml_metal_device_t dev, const struct ggml_te
             return true;
         case GGML_OP_MUL_MAT:
         case GGML_OP_MUL_MAT_ID:
-            return has_simdgroup_reduction &&
-                (op->src[0]->type != GGML_TYPE_F32 || op->src[1]->type == GGML_TYPE_F32);
+            return has_simdgroup_reduction;
         case GGML_OP_CPY:
         case GGML_OP_DUP:
         case GGML_OP_CONT:
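
The effect of the hunk above can be sketched in isolation. The snippet below is an illustration only, not code from the repository: `sketch_type`, `supports_mul_mat_old` and `supports_mul_mat_new` are hypothetical names, and the enum models just the two types the hunk mentions. It assumes the old condition was the two-line `return` shown in the hunk and the new one is the single-line `return`.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the ggml type enum; only the two values
 * that appear in the hunk are modelled here. */
enum sketch_type { SKETCH_TYPE_F32, SKETCH_TYPE_F16 };

/* Old check: an F32 src0 was accepted only when src1 was also F32. */
static bool supports_mul_mat_old(bool has_simdgroup_reduction,
                                 enum sketch_type src0,
                                 enum sketch_type src1) {
    return has_simdgroup_reduction &&
           (src0 != SKETCH_TYPE_F32 || src1 == SKETCH_TYPE_F32);
}

/* New check: only the device capability is consulted; the operand
 * types no longer restrict support. */
static bool supports_mul_mat_new(bool has_simdgroup_reduction,
                                 enum sketch_type src0,
                                 enum sketch_type src1) {
    (void) src0; /* unused by the relaxed check */
    (void) src1;
    return has_simdgroup_reduction;
}
```

Under this sketch, the only type combination whose result changes is an F32 `src0` paired with a non-F32 `src1` (for example F16): the old predicate rejects it and the new one accepts it, provided the device reports simdgroup reduction support.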