[Misc] Clean m.def and add Development Tips (#4890)

yinfan98
2025-03-30 14:06:18 +08:00
committed by GitHub
parent 54b9a2de0a
commit 0d7fe866f9
3 changed files with 86 additions and 159 deletions

@@ -51,6 +51,47 @@ Steps to add a new kernel:
4. Update [CMakeLists.txt](https://github.com/sgl-project/sglang/blob/main/sgl-kernel/CMakeLists.txt) to include new CUDA source
5. Expose Python interface in [python](https://github.com/sgl-project/sglang/blob/main/sgl-kernel/python/sgl_kernel)
### Development Tips
1. When implementing kernels in [csrc](https://github.com/sgl-project/sglang/tree/main/sgl-kernel/csrc), only define pure CUDA files and C++ interfaces. If you need `torch::Tensor`, include `<torch/all.h>` instead of `<torch/extension.h>`; including `<torch/extension.h>` causes compilation errors when building with SABI (the Python stable ABI).
2. When creating torch extensions, register each function by adding its definition with `m.def`:
```cpp
m.def("register_graph_buffers", register_graph_buffers);
```
3. When exposing Python interfaces, pass arguments to C++ interface kernels positionally; avoid keyword arguments (kwargs).
**Avoid this:**
```python
torch.ops.sgl_kernel.apply_rope_pos_ids_cos_sin_cache.default(
    q=query.view(query.shape[0], -1, head_size),
    k=key.view(key.shape[0], -1, head_size),
    q_rope=query.view(query.shape[0], -1, head_size),
    k_rope=key.view(key.shape[0], -1, head_size),
    cos_sin_cache=cos_sin_cache,
    pos_ids=positions.long(),
    interleave=(not is_neox),
    cuda_stream=get_cuda_stream(),
)
```
**Use this instead:**
```python
torch.ops.sgl_kernel.apply_rope_pos_ids_cos_sin_cache.default(
    query.view(query.shape[0], -1, head_size),
    key.view(key.shape[0], -1, head_size),
    query.view(query.shape[0], -1, head_size),
    key.view(key.shape[0], -1, head_size),
    cos_sin_cache,
    positions.long(),
    (not is_neox),
    get_cuda_stream(),
)
```
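
One way to follow tip 3 while keeping a readable Python API is to accept keyword arguments only in the Python wrapper and forward them positionally to the op. The sketch below is illustrative: `fake_rope_op` and `apply_rope` are hypothetical names, not part of `sgl_kernel`, and the stand-in uses positional-only parameters to mimic an op that should only be called positionally.

```python
from typing import Any, Callable


def fake_rope_op(q, k, q_rope, k_rope, cos_sin_cache, pos_ids, interleave, cuda_stream, /):
    """Hypothetical stand-in for a registered C++ op (positional arguments only)."""
    return (q, k, q_rope, k_rope, cos_sin_cache, pos_ids, interleave, cuda_stream)


def apply_rope(
    query: Any,
    key: Any,
    cos_sin_cache: Any,
    positions: Any,
    *,
    is_neox: bool,
    cuda_stream: int,
    op: Callable[..., Any] = fake_rope_op,
) -> Any:
    """Keyword-friendly Python wrapper (hypothetical) that forwards positionally."""
    # Argument order must match the C++ signature exactly, since no names are passed.
    return op(query, key, query, key, cos_sin_cache, positions, not is_neox, cuda_stream)
```

Callers can then write `apply_rope(q, k, cache, pos, is_neox=True, cuda_stream=0)`, while the call into the op itself stays positional.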
### Build & Install
Development build: