sglang/sgl_kernel at 08acdb5c3db3eef90d40ae0b7389d5fb604eae73

EngineX-Hygon/sglang

PGFLMG 08acdb5c3d [Feat] Scale up fa3 kernel to sm8x arch (#5912)

Co-authored-by: zhyncs <me@zhyncs.com>

2025-04-30 13:59:36 -07:00

__init__.py

[FEATURE] Enhance platform compatibility for ARM (#5746)

2025-04-29 15:06:16 -07:00

allreduce.py

sgl-kernel transfer custom allreduce from trt kernel to vllm kernel (#5079)

2025-04-05 14:23:20 -07:00

attention.py

Add Cutlass MLA attention backend (#5390)

2025-04-27 20:58:53 -07:00

elementwise.py

[Feat] Update sgl-kernel flashinfer to latest main version (#5500)

2025-04-17 12:43:23 -07:00

flash_attn.py

[Feat] Scale up fa3 kernel to sm8x arch (#5912)

2025-04-30 13:59:36 -07:00

gemm.py

fix: remove cublas_grouped_gemm (#5307)

2025-04-11 16:22:37 -07:00

grammar.py

fix sgl-kernel unit tests (#5666)

2025-04-23 01:18:30 -07:00

moe.py

[1/2] Add FP8 Blockscale MoE CUTLASS kernel for Blackwell (#5281)

2025-04-22 22:28:20 -07:00

sampling.py

Fix sampler nan check when calling top_k_top_p_sampling_from_probs (#5546)

2025-04-19 21:47:23 -07:00

sparse_flash_attn.py

[Feat] QWen-1M context support [1/2]: Update block sparse attention backend utils kernel (#5847)

2025-04-28 11:03:17 -07:00

speculative.py

use default for torch.ops (#4835)

2025-03-27 19:09:58 -07:00

utils.py

Rename files in sgl kernel to avoid nested folder structure (#4213)

2025-03-08 22:54:51 -08:00

version.py

chore: bump sgl-kernel 0.1.0 (#5688)

2025-04-23 14:23:59 -07:00