From ac6962ccd6b2bfe62cb115ed08bab0552338dbf6 Mon Sep 17 00:00:00 2001
From: PGFLMG <1106310035@qq.com>
Date: Sat, 2 Aug 2025 17:03:07 +0800
Subject: [PATCH] [Doc] Polish sgl-kernel readme for cu126 build error (#8704)

---
 sgl-kernel/README.md | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/sgl-kernel/README.md b/sgl-kernel/README.md
index c71b92335..72491433a 100644
--- a/sgl-kernel/README.md
+++ b/sgl-kernel/README.md
@@ -58,7 +58,15 @@ And if you build the sgl-kernel with cmake, you need to add `CMAKE_BUILD_PARALLE
 CMAKE_BUILD_PARALLEL_LEVEL=$(nproc) python -m uv build --wheel -Cbuild-dir=build --color=always .
 ```
 
-### FlashAttention on Hopper
+### ⚠️ Compilation Issue with `sgl-kernel` and CUDA 12.6
+
+When compiling `sgl-kernel` with FlashAttention on a Hopper GPU using CUDA 12.6, you may encounter a segmentation fault:
+
+```bash
+kernel/build/_deps/repo-flash-attention-src/hopper/instantiations/flash_fwd_hdimall_bf16_paged_softcap_sm90.cu -o CMakeFiles/flash_ops.dir/_deps/repo-flash-attention-src/hopper/instantiations/flash_fwd_hdimall_bf16_paged_softcap_sm90.cu.o
+Segmentation fault (core dumped)
+```
+
 ⚠️ **Note**: To ensure that FlashAttention compiles correctly on Hopper GPU Architecture(sm90), it is strongly [recommended](https://github.com/Dao-AILab/flash-attention/issues/1453) to use:
 - nvcc version: 12.6
 - ptxas version: 12.8
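
Note (outside the patch): the README change above recommends a split toolchain on sm90 (nvcc 12.6 with ptxas 12.8). A minimal sketch for checking which versions are actually on `PATH` before building, assuming only that the standard `nvcc`/`ptxas` binaries are installed (both support `--version`):

```shell
# Report the installed nvcc and ptxas versions, or note their absence.
# The FlashAttention issue linked in the README recommends nvcc 12.6
# paired with ptxas 12.8 for Hopper (sm90) builds.
for tool in nvcc ptxas; do
  if command -v "$tool" >/dev/null 2>&1; then
    # Both tools print a "release X.Y" line in their --version output.
    echo "$tool: $("$tool" --version | grep -o 'release [0-9.]*' | head -n1)"
  else
    echo "$tool: not found on PATH"
  fi
done
```

If the two reported releases do not match the recommendation, pointing `PATH` at a CUDA 12.6 toolkit whose `ptxas` has been swapped for the 12.8 binary (as described in the linked issue) is one way to satisfy it.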