fix(ggml-cuda): correct CUDA compile flags and WARP_SIZE configuration

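- Update the CUDA compile flags to use the correct fast-math and extended-lambda options
- Set WARP_SIZE to 64 to match the target hardware
- Remove the -Wmissing-noreturn warning option
- Fix the cudaStreamWaitEvent call that was missing an argument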
2026-01-23 16:42:43 +08:00
parent b1cf23ae3e
commit 8d3f9b9cb1
5 changed files with 19 additions and 9 deletions
--- a/README.md
+++ b/README.md
@@ -1,3 +1,11 @@
 # enginex-bi_150-llama.cpp

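 A text-generation engine that runs on the Iluvatar CoreX (天数智芯) Tiangai 150 (天垓150) accelerator card, based on llama.cpp (b7516) with architecture-specific adaptation and optimization.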
+
+## Build Docker Image
+
+```bash
+docker build -t enginex-iluvatar/iluvatar-llama.cpp:b7516-bi150 .
+```
+
+Latest image: git.modelhub.org.cn:9443/enginex-iluvatar/iluvatar-llama.cpp:b7516-bi150