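xiezhongtao 8d3f9b9cb1 fix(ggml-cuda): fix CUDA compile flags and WARP_SIZE configuration
Update the CUDA compile flags to use the correct fast-math and extended-lambda options
Set WARP_SIZE to 64 to match the target hardware
Remove the -Wmissing-noreturn warning option
Fix the cudaStreamWaitEvent call that was missing an argument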
2026-01-23 16:42:43 +08:00
2026-01-23 11:34:20 +08:00

enginex-bi_150-llama.cpp

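A text-generation engine for the Iluvatar CoreX (天数智芯) Tiangai 150 (天垓150) accelerator card, based on llama.cpp (b7516) with architecture-specific adaptations and optimizations.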

Build Docker Image

docker build -t enginex-iluvatar/iluvatar-llama.cpp:b7516-bi150 .

Latest image: git.modelhub.org.cn:9443/enginex-iluvatar/iluvatar-llama.cpp:b7516-bi150

Description
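A text-generation engine for the Iluvatar CoreX (天数智芯) Tiangai 150 (天垓150) accelerator card, based on llama.cpp with architecture-specific adaptations and optimizations.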
Readme MIT 26 MiB
Languages
C++ 56.1%
C 12.6%
Python 7.9%
Cuda 6.5%
HTML 4.6%
Other 12.2%