EngineX-Ascend / enginex-ascend-910-llama.cpp
ggml / src / ggml-vulkan (files at commit 7ecd780b1a1d5214b8d04c25ebfc194d310816ed)
Latest commit 7ecd780b1a by Jeff Bolz (2025-04-09 07:12:57 +02:00):
vulkan: Use fp16 for the flash attention P*V multiplication (#12783)
This is consistent with the ggml-cuda behavior and the mul_mat fallback.
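The commit subject refers to the P*V stage of flash attention, in which the softmax probability matrix P is multiplied by the value matrix V. As a rough, CPU-side illustration of performing that product with fp16 operands, here is a minimal C++ sketch; the function name, the row-major layout, and the fp32 accumulation are illustrative assumptions, not the actual Vulkan shader code from this commit.

```cpp
// Conceptual sketch only, not code from this repository.
// "half" stands in for a 16-bit float type; _Float16 requires compiler
// support (recent GCC/Clang).
#include <cstddef>
#include <vector>

using half = _Float16;

// out[r][c] += sum_k P[r][k] * V[k][c], with P and V held in fp16.
// P is rows x kv, V is kv x cols, out is rows x cols (fp32 accumulator here).
void pv_multiply_fp16(const std::vector<half>& P,
                      const std::vector<half>& V,
                      std::vector<float>& out,
                      std::size_t rows, std::size_t kv, std::size_t cols) {
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < kv; ++k) {
                // The multiplication itself uses fp16 operands; the product
                // is widened to fp32 for the running sum in this sketch.
                acc += static_cast<float>(P[r * kv + k] * V[k * cols + c]);
            }
            out[r * cols + c] += acc;
        }
    }
}
```

Per the commit message, performing this multiply in fp16 matches what ggml-cuda and the mul_mat fallback already do.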
Name            | Last commit                                                            | Last updated
cmake           | cmake: fix ggml-shaders-gen compiler paths containing spaces (#12747)  | 2025-04-04 10:12:40 -03:00
vulkan-shaders  | vulkan: Use fp16 for the flash attention P*V multiplication (#12783)   | 2025-04-09 07:12:57 +02:00
CMakeLists.txt  | vulkan: Fix missing cmake logic for dot product extension (#12721)     | 2025-04-03 10:08:26 -05:00
ggml-vulkan.cpp | vulkan: Use unclamped loads for flash attention mask (#12720)          | 2025-04-06 10:47:13 +02:00