[Doc] Polish sgl-kernel readme for cu126 build error (#8704)
@@ -58,7 +58,15 @@ And if you build the sgl-kernel with cmake, you need to add `CMAKE_BUILD_PARALLE
CMAKE_BUILD_PARALLEL_LEVEL=$(nproc) python -m uv build --wheel -Cbuild-dir=build --color=always .
```

### ⚠️ Compilation Issue with `sgl-kernel` and CUDA 12.6

When compiling `sgl-kernel` with FlashAttention on a Hopper GPU using CUDA 12.6, you may encounter a segmentation fault:

```bash
kernel/build/_deps/repo-flash-attention-src/hopper/instantiations/flash_fwd_hdimall_bf16_paged_softcap_sm90.cu -o CMakeFiles/flash_ops.dir/_deps/repo-flash-attention-src/hopper/instantiations/flash_fwd_hdimall_bf16_paged_softcap_sm90.cu.o
Segmentation fault (core dumped)
```

⚠️ **Note**: To ensure that FlashAttention compiles correctly on the Hopper GPU architecture (sm90), it is strongly [recommended](https://github.com/Dao-AILab/flash-attention/issues/1453) to use:

- nvcc version: 12.6
- ptxas version: 12.8
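
As a quick sanity check (a minimal sketch; it assumes `nvcc` and `ptxas` are on your `PATH`, which may not match the toolchain your build actually picks up), you can confirm which versions you have before compiling:

```bash
# Report the CUDA compiler driver version (expected: 12.6).
nvcc --version

# Report the PTX assembler version (expected: 12.8).
ptxas --version
```

If `ptxas` reports a version older than 12.8, update it as described in the linked issue before rebuilding.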