sglang/sgl_kernel at 8e09b370777cd65e8769d5ec7921edb5219f3118

EngineX-Hygon/sglang

Xiaoyu Zhang 8e09b37077 Sgl kernel fused_moe_gate support n_shared_experts (#5440)

2025-04-17 23:05:15 -07:00

__init__.py

kernel: support slightly faster merge_state_v2 cuda kernel (#5381)

2025-04-14 21:28:23 -07:00

allreduce.py

sgl-kernel transfer custom allreduce from trt kernel to vllm kernel (#5079)

2025-04-05 14:23:20 -07:00

attention.py

BLackwell cutlass mla: Add check for bad page size/block num combinations (#5431)

2025-04-15 14:07:42 -07:00

elementwise.py

[Feat] Update sgl-kernel flashinfer to latest main version (#5500)

2025-04-17 12:43:23 -07:00

flash_attn.py

Add flash_attn_varlen_func to sgl-kernel (#5315)

2025-04-11 23:36:36 -07:00

gemm.py

fix: remove cublas_grouped_gemm (#5307)

2025-04-11 16:22:37 -07:00

moe.py

Sgl kernel fused_moe_gate support n_shared_experts (#5440)

2025-04-17 23:05:15 -07:00

sampling.py

[Feat] Update sgl-kernel flashinfer to latest main version (#5500)

2025-04-17 12:43:23 -07:00

sparse_flash_attn.py

[Feat] Add sparse attn to sgl-kernel (#5327)

2025-04-12 11:36:36 -07:00

speculative.py

use default for torch.ops (#4835)

2025-03-27 19:09:58 -07:00

utils.py

Rename files in sgl kernel to avoid nested folder structure (#4213)

2025-03-08 22:54:51 -08:00

version.py

chore: bump sgl-kernel v0.0.9.post1 (#5430)

2025-04-15 11:00:21 -07:00